Find count of characters within the string in Python

Chiyaan Suraj picture Chiyaan Suraj · Dec 3, 2016 · Viewed 8.3k times · Source

I am trying to create a dictionary of word and number of times it is repeating in string. Say suppose if string is like below

str1 = "aabbaba"

I want to create a dictionary like this

word_count = {'a':4,'b':3}

I am trying to use dictionary comprehension to do this. I did

dic = {x:dic[x]+1 if x in dic.keys() else x:1 for x in str}

This ends up giving an error saying

  File "<stdin>", line 1
    dic = {x:dic[x]+1 if x in dic.keys() else x:1 for x in str}
                                               ^
SyntaxError: invalid syntax

Can anybody tell me what's wrong with the syntax? Also,How can I create such a dictionary using dictionary comprehension?

Answer

dawg picture dawg · Dec 3, 2016

As others have said, this is best done with a Counter.

You can also do:

>>> {e:str1.count(e) for e in set(str1)}
{'a': 4, 'b': 3}

But that traverses the string 1+n times for each unique character (once to create the set, and once for each unique letter to count the number of times it appears. i.e., This has quadratic runtime complexity.). Bad result if you have a lot of unique characters in a long string... A Counter only traverses the string once.

If you want no import version that is more efficient than using .count, you can use .setdefault to make a counter:

>>> count={}
>>> for c in str1:
...    count[c]=count.setdefault(c, 0)+1
... 
>>> count
{'a': 4, 'b': 3}

That only traverses the string once no matter how long or how many unique characters.


You can also use defaultdict if you prefer:

>>> from collections import defaultdict
>>> count=defaultdict(int)
>>> for c in str1:
...    count[c]+=1
... 
>>> count
defaultdict(<type 'int'>, {'a': 4, 'b': 3})
>>> dict(count)
{'a': 4, 'b': 3}

But if you are going to import collections -- Use a Counter!