Check if specific characters are in a string

Aitana Ezkibel · Jul 11, 2017 · Viewed 7.8k times

I need to count how many characters of a string fall into each of two groups. I have divided the characters into chars1 (a to m) and chars2 (n to z), and have one counter for each.

The output should be 0/14, but it is 0/1 instead. I think it only checks whether one and only one item is contained and then exits the loop. Is that the case?

Here is the code.

string_1 = "aaabbbbhaijjjm"

def error_printer(s):
    chars1 = "abcdefghijklm"
    chars2 = "nopqrstuvwxyz"
    counter1 = 0
    counter2 = 0

    if ((c in s) for c in chars1):
        counter1 += 1
    elif ((c in s) for c in chars2):
        counter2 += 1
    print(str(counter2) + "/" + str(counter1))

error_printer(string_1)

Answer

Willem Van Onsem · Jul 11, 2017

Number of characters in chars1/chars2 that occur in s

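That makes sense: the if is not inside a loop, so its body runs at most once and counter1 can never exceed 1. The condition itself is a generator object, which Python always treats as true, so the first branch is taken no matter what s contains.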
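A minimal sketch of why the if branch is always taken, using a hypothetical input "xyz" that contains none of the characters in chars1 (the variable names are illustrative only). A generator expression is an object, so bool() of it is True whatever it would yield, whereas any() actually consumes it:

```python
s = "xyz"                      # hypothetical input with no characters from a..m
chars1 = "abcdefghijklm"

# The generator is never iterated here; only the object's truthiness is tested.
gen = ((c in s) for c in chars1)
always_true = bool(gen)

# any() iterates over the generator and tests each item.
any_match = any(c in s for c in chars1)

print(always_true, any_match)  # True False
```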

Now we can unfold the generator into a for loop. This solves one part of the problem and produces 0/6:

for c in chars1:
    if c in s:
        counter1 += 1
for c in chars2:
    if c in s:
        counter2 += 1

Nevertheless, this is still not terribly efficient: checking whether a character is in a string takes O(n) in the worst case. You can first construct a set of the characters in the string and then perform lookups on it, which take O(1) on average:

def error_printer(s):
    sset = set(s)
    chars1 = "abcdefghijklm"
    chars2 = "nopqrstuvwxyz"
    counter1 = 0
    counter2 = 0
    for c in chars1:
        if c in sset:
            counter1 += 1
    for c in chars2:
        if c in sset:
            counter2 += 1
    print(str(counter2) + "/" + str(counter1))

Now we have improved the efficiency, but the code is still not very elegant: it is long, and one has to read through it to see what it does. We can use a sum(..) construct to count the elements that satisfy a certain condition:

def error_printer(s):
    sset = set(s)
    chars1 = "abcdefghijklm"
    chars2 = "nopqrstuvwxyz"
    counter1 = sum(c in sset for c in chars1)
    counter2 = sum(c in sset for c in chars2)
    print(str(counter2) + "/" + str(counter1))

This produces 0/6, since six distinct characters in the [a-m] range occur in s and none in the [n-z] range do.
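The same distinct-character count can also be written with set intersection. This is only a sketch of an equivalent formulation, not part of the code above; the function name count_distinct and the returned tuple are assumptions made so the result can be inspected:

```python
def count_distinct(s):
    """Return (distinct chars of s in a..m, distinct chars of s in n..z)."""
    sset = set(s)
    first_half = set("abcdefghijklm")
    second_half = set("nopqrstuvwxyz")
    # The intersection keeps only the characters present in both sets.
    return len(sset & first_half), len(sset & second_half)

counter1, counter2 = count_distinct("aaabbbbhaijjjm")
print(str(counter2) + "/" + str(counter1))  # 0/6
```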

Number of characters in s that occur in chars1/chars2

Based on the body of the question, however, you want to count the characters in s that fall into each of the two ranges, counting repeated characters every time they occur. In that case we simply have to swap the loops:

def error_printer(s):
    chars1 = set("abcdefghijklm")
    chars2 = set("nopqrstuvwxyz")
    counter1 = sum(c in chars1 for c in s)
    counter2 = sum(c in chars2 for c in s)
    print(str(counter2) + "/" + str(counter1))

This produces 0/14, since all 14 characters in s fall in the [a-m] range (if 'a' occurs twice in s, we count it twice) and none of them fall in the [n-z] range.
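As a quick check of this counting behaviour, here is a sketch that returns the two counters instead of printing them. The name count_occurrences and the sample input "hello world" are illustrative assumptions. Summing booleans works because True counts as 1, and characters outside both ranges (such as the space) are ignored:

```python
def count_occurrences(s):
    """Return (chars of s in a..m, chars of s in n..z), counting repeats."""
    chars1 = set("abcdefghijklm")
    chars2 = set("nopqrstuvwxyz")
    counter1 = sum(c in chars1 for c in s)  # each True adds 1
    counter2 = sum(c in chars2 for c in s)
    return counter1, counter2

print(count_occurrences("aaabbbbhaijjjm"))  # (14, 0)
print(count_occurrences("hello world"))     # (6, 4)
```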

Using range checks

Since we are working with contiguous ranges, we can replace the membership checks with a chained comparison per range, like:

def error_printer(s):
    counter1 = sum('a' <= c <= 'm' for c in s)
    counter2 = sum('n' <= c <= 'z' for c in s)
    print(str(counter2) + "/" + str(counter1))
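One caveat: these comparisons work on code points, so uppercase letters fall outside both ranges. The sketch below returns the counters so this is easy to see; the name count_by_range is hypothetical, and lowercasing the input first is only an assumption about what a case-insensitive variant might look like:

```python
def count_by_range(s):
    """Return (chars of s in 'a'..'m', chars of s in 'n'..'z')."""
    counter1 = sum('a' <= c <= 'm' for c in s)
    counter2 = sum('n' <= c <= 'z' for c in s)
    return counter1, counter2

print(count_by_range("aaabbbbhaijjjm"))   # (14, 0)
print(count_by_range("ABC xyz"))          # (0, 3): uppercase is not matched
print(count_by_range("ABC xyz".lower()))  # (3, 3)
```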