I have a Python list, and I want to know the quickest way to count the number of occurrences of the item '1' in this list. In my actual case, the item can occur tens of thousands of times, which is why I want a fast approach.
['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']
Which approach, list.count() or collections.Counter, is likely to be faster?
a = ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']
print(a.count("1"))
list.count() is implemented in C in CPython, so it's heavily optimized.
Edit: I randomly generated a large list.
In [8]: len(a)
Out[8]: 6339347
In [9]: %timeit a.count("1")
10 loops, best of 3: 86.4 ms per loop
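(For reference, a list of comparable size can be built with something like the sketch below; the value mix here is arbitrary, since the generation code used for the timing above isn't shown.)

import random

# Build roughly 6.3 million string items drawn from a small pool of values.
# The exact distribution behind the timings above is an assumption here.
a = [str(random.randint(1, 10)) for _ in range(6339347)]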
Edit edit: The same thing can be done with collections.Counter:

from collections import Counter

c = Counter(a)
print(c['1'])
Using the same list as in my previous timing example:
In [17]: %timeit Counter(a)['1']
1 loops, best of 3: 1.52 s per loop
My timing is simplistic and depends on many factors, but it gives a good rough idea of the relative performance.
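To double-check outside an IPython session, roughly the same comparison can be run with the timeit module directly; this is a sketch of that setup (with a smaller list for a quick run), not the exact benchmark above.

import random
import timeit
from collections import Counter

a = [str(random.randint(1, 10)) for _ in range(1000000)]

# list.count() scans the whole list once per call.
print(timeit.timeit(lambda: a.count('1'), number=10))

# Counter(a) makes one pass but pays dict overhead for every item.
print(timeit.timeit(lambda: Counter(a)['1'], number=10))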
Here is some profiling with the profile module:
In [24]: profile.run("a.count('1')")
3 function calls in 0.091 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.091 0.091 <string>:1(<module>)
1 0.091 0.091 0.091 0.091 {method 'count' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
In [25]: profile.run("b = Counter(a); b['1']")
6339356 function calls in 2.143 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 2.143 2.143 <string>:1(<module>)
2 0.000 0.000 0.000 0.000 _weakrefset.py:68(__contains__)
1 0.000 0.000 0.000 0.000 abc.py:128(__instancecheck__)
1 0.000 0.000 2.143 2.143 collections.py:407(__init__)
1 1.788 1.788 2.143 2.143 collections.py:470(update)
1 0.000 0.000 0.000 0.000 {getattr}
1 0.000 0.000 0.000 0.000 {isinstance}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
6339347 0.356 0.000 0.356 0.000 {method 'get' of 'dict' objects}
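That said, Counter isn't pointless here: if you need counts for several distinct items (or all of them), Counter builds the whole table in a single pass, whereas each .count() call rescans the list. A rough sketch of that trade-off:

from collections import Counter

a = ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']

# One pass over the list builds counts for every distinct item.
counts = Counter(a)
print(counts['1'], counts['2'], counts['7'], counts['10'])      # 6 4 3 2

# Each .count() call, by contrast, scans the whole list again.
print(a.count('1'), a.count('2'), a.count('7'), a.count('10'))  # 6 4 3 2

So for a single item .count() wins; once you need counts for many different items, the single Counter pass is the better fit.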