Edit 2: It was suggested that this is a copy of a similar question. I'd disagree since my question focuses on speed, while the other question asks what is more "readable" or "better" (without defining better). While the questions are similar, there is a big difference in the discussion/answers given.
EDIT: I realise from the questions that I could have been clearer. Sorry for code typos, yes it should be using the proper python operator for addition.
Regarding the input data, I just chose a list of random numbers since that's a common sample. In my case I'm using a dict where I expect a lot of keyerrors, probably 95% of the keys will not exist, and the few that exist will contain clusters of data.
I'm interested in a general discussion though, regardless of the input data set, but of course samples with running times are interesting.
My standard approach would be like so many other posts to write something like
list = (100 random numbers)
d = {}
for x in list:
if x in d:
d[x]+=1
else:
d[x]=1
But I just came to think of this being faster, since we dont have to check if the dictionary contains the key. We just assume it does, and if not, we handle that. Is there any difference or is Python smarter than I am?
list = (100 random numbers)
d = {}
for x in list:
try:
d[x]+=1
except KeyError:
d[x] = 1
The same approach with indexes in an array, out of bounds, negative indexes etc.
Your claim is absolutely false depends on the input.
If you have a diverse set of keys, and hits the except
block often, the performance is not good. If the try
block is dominant the try/except
idiom can be performant on smaller lists.
Here is a benchmark showing several ways to do the same thing:
from __future__ import print_function
import timeit
import random
import collections
def f1():
d={}
for x in tgt:
if x in d:
d[x]+=1
else:
d[x]=1
return d
def f2():
d = {}
for x in tgt:
try:
d[x]+=1
except KeyError:
d[x] = 1
return d
def f3():
d={}.fromkeys(tgt, 0)
for x in tgt:
d[x]+=1
return d
def f4():
d=collections.defaultdict(int)
for x in tgt:
d[x]+=1
return d
def f5():
return collections.Counter(tgt)
def f6():
d={}
for x in tgt:
d[x]=d.setdefault(x, 0)+1
return d
def f7():
d={}
for x in tgt:
d[x]=d.get(x,0)+1
return d
def cmpthese(funcs, c=10000, rate=True, micro=False):
"""Generate a Perl style function benchmark"""
def pprint_table(table):
"""Perl style table output"""
def format_field(field, fmt='{:,.0f}'):
if type(field) is str: return field
if type(field) is tuple: return field[1].format(field[0])
return fmt.format(field)
def get_max_col_w(table, index):
return max([len(format_field(row[index])) for row in table])
col_paddings=[get_max_col_w(table, i) for i in range(len(table[0]))]
for i,row in enumerate(table):
# left col
row_tab=[row[0].ljust(col_paddings[0])]
# rest of the cols
row_tab+=[format_field(row[j]).rjust(col_paddings[j]) for j in range(1,len(row))]
print(' '.join(row_tab))
results={k.__name__:timeit.Timer(k).timeit(c) for k in funcs}
fastest=sorted(results,key=results.get, reverse=True)
table=[['']]
if rate: table[0].append('rate/sec')
if micro: table[0].append('usec/pass')
table[0].extend(fastest)
for e in fastest:
tmp=[e]
if rate:
tmp.append('{:,}'.format(int(round(float(c)/results[e]))))
if micro:
tmp.append('{:.3f}'.format(1000000*results[e]/float(c)))
for x in fastest:
if x==e: tmp.append('--')
else: tmp.append('{:.1%}'.format((results[x]-results[e])/results[e]))
table.append(tmp)
pprint_table(table)
if __name__=='__main__':
import sys
print(sys.version)
for j in [100,1000]:
for t in [(0,5), (0,50), (0,500)]:
tgt=[random.randint(*t) for i in range(j)]
print('{} rand ints between {}:'.format(j,t))
print('=====')
cmpthese([f1,f2,f3,f4,f5,f6,f7])
print()
I have included a small benchmark function based on timeit
that prints the functions from Slowest to Fastest with a percent difference between them.
Here is the results for Python 3:
3.4.1 (default, May 19 2014, 13:10:29)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)]
100 rand ints between (0, 5):
=====
rate/sec f6 f7 f1 f2 f3 f4 f5
f6 52,756 -- -1.6% -26.2% -27.9% -30.7% -36.7% -46.8%
f7 53,624 1.6% -- -25.0% -26.7% -29.6% -35.7% -46.0%
f1 71,491 35.5% 33.3% -- -2.3% -6.1% -14.2% -28.0%
f2 73,164 38.7% 36.4% 2.3% -- -3.9% -12.2% -26.3%
f3 76,148 44.3% 42.0% 6.5% 4.1% -- -8.7% -23.3%
f4 83,368 58.0% 55.5% 16.6% 13.9% 9.5% -- -16.0%
f5 99,247 88.1% 85.1% 38.8% 35.6% 30.3% 19.0% --
100 rand ints between (0, 50):
=====
rate/sec f2 f6 f7 f4 f3 f1 f5
f2 39,405 -- -17.9% -18.7% -19.1% -41.8% -47.8% -56.3%
f6 47,980 21.8% -- -1.1% -1.6% -29.1% -36.5% -46.8%
f7 48,491 23.1% 1.1% -- -0.5% -28.4% -35.8% -46.2%
f4 48,737 23.7% 1.6% 0.5% -- -28.0% -35.5% -46.0%
f3 67,678 71.7% 41.1% 39.6% 38.9% -- -10.4% -24.9%
f1 75,511 91.6% 57.4% 55.7% 54.9% 11.6% -- -16.3%
f5 90,175 128.8% 87.9% 86.0% 85.0% 33.2% 19.4% --
100 rand ints between (0, 500):
=====
rate/sec f2 f4 f6 f7 f3 f1 f5
f2 25,748 -- -22.0% -41.4% -42.6% -57.5% -66.2% -67.8%
f4 32,996 28.1% -- -24.9% -26.4% -45.6% -56.7% -58.8%
f6 43,930 70.6% 33.1% -- -2.0% -27.5% -42.4% -45.1%
f7 44,823 74.1% 35.8% 2.0% -- -26.1% -41.2% -44.0%
f3 60,624 135.5% 83.7% 38.0% 35.3% -- -20.5% -24.2%
f1 76,244 196.1% 131.1% 73.6% 70.1% 25.8% -- -4.7%
f5 80,026 210.8% 142.5% 82.2% 78.5% 32.0% 5.0% --
1000 rand ints between (0, 5):
=====
rate/sec f7 f6 f1 f3 f2 f4 f5
f7 4,993 -- -6.7% -34.6% -39.4% -44.4% -50.1% -71.1%
f6 5,353 7.2% -- -29.9% -35.0% -40.4% -46.5% -69.0%
f1 7,640 53.0% 42.7% -- -7.3% -14.9% -23.6% -55.8%
f3 8,242 65.1% 54.0% 7.9% -- -8.2% -17.6% -52.3%
f2 8,982 79.9% 67.8% 17.6% 9.0% -- -10.2% -48.1%
f4 10,004 100.4% 86.9% 30.9% 21.4% 11.4% -- -42.1%
f5 17,293 246.4% 223.0% 126.3% 109.8% 92.5% 72.9% --
1000 rand ints between (0, 50):
=====
rate/sec f7 f6 f1 f2 f3 f4 f5
f7 5,051 -- -7.1% -26.5% -29.0% -34.1% -45.7% -71.2%
f6 5,435 7.6% -- -20.9% -23.6% -29.1% -41.5% -69.0%
f1 6,873 36.1% 26.5% -- -3.4% -10.3% -26.1% -60.8%
f2 7,118 40.9% 31.0% 3.6% -- -7.1% -23.4% -59.4%
f3 7,661 51.7% 41.0% 11.5% 7.6% -- -17.6% -56.3%
f4 9,297 84.0% 71.1% 35.3% 30.6% 21.3% -- -47.0%
f5 17,531 247.1% 222.6% 155.1% 146.3% 128.8% 88.6% --
1000 rand ints between (0, 500):
=====
rate/sec f2 f4 f6 f7 f3 f1 f5
f2 3,985 -- -11.0% -13.6% -14.8% -25.7% -40.4% -66.9%
f4 4,479 12.4% -- -2.9% -4.3% -16.5% -33.0% -62.8%
f6 4,613 15.8% 3.0% -- -1.4% -14.0% -31.0% -61.6%
f7 4,680 17.4% 4.5% 1.4% -- -12.7% -30.0% -61.1%
f3 5,361 34.5% 19.7% 16.2% 14.6% -- -19.8% -55.4%
f1 6,683 67.7% 49.2% 44.9% 42.8% 24.6% -- -44.4%
f5 12,028 201.8% 168.6% 160.7% 157.0% 124.3% 80.0% --
And Python 2:
2.7.6 (default, Dec 1 2013, 13:26:15)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.2.79)]
100 rand ints between (0, 5):
=====
rate/sec f5 f7 f6 f2 f1 f3 f4
f5 24,955 -- -41.8% -42.5% -51.3% -55.7% -61.6% -65.2%
f7 42,867 71.8% -- -1.2% -16.4% -23.9% -34.0% -40.2%
f6 43,382 73.8% 1.2% -- -15.4% -23.0% -33.2% -39.5%
f2 51,293 105.5% 19.7% 18.2% -- -9.0% -21.0% -28.5%
f1 56,357 125.8% 31.5% 29.9% 9.9% -- -13.2% -21.4%
f3 64,924 160.2% 51.5% 49.7% 26.6% 15.2% -- -9.5%
f4 71,709 187.3% 67.3% 65.3% 39.8% 27.2% 10.5% --
100 rand ints between (0, 50):
=====
rate/sec f2 f5 f7 f6 f4 f3 f1
f2 22,439 -- -4.7% -45.1% -45.5% -50.7% -63.3% -64.5%
f5 23,553 5.0% -- -42.4% -42.8% -48.3% -61.5% -62.8%
f7 40,878 82.2% 73.6% -- -0.7% -10.2% -33.2% -35.4%
f6 41,164 83.4% 74.8% 0.7% -- -9.6% -32.7% -34.9%
f4 45,525 102.9% 93.3% 11.4% 10.6% -- -25.6% -28.0%
f3 61,167 172.6% 159.7% 49.6% 48.6% 34.4% -- -3.3%
f1 63,261 181.9% 168.6% 54.8% 53.7% 39.0% 3.4% --
100 rand ints between (0, 500):
=====
rate/sec f2 f5 f4 f6 f7 f3 f1
f2 13,122 -- -39.9% -56.2% -63.2% -63.8% -75.8% -80.0%
f5 21,837 66.4% -- -27.1% -38.7% -39.8% -59.6% -66.7%
f4 29,945 128.2% 37.1% -- -16.0% -17.4% -44.7% -54.3%
f6 35,633 171.6% 63.2% 19.0% -- -1.7% -34.2% -45.7%
f7 36,257 176.3% 66.0% 21.1% 1.8% -- -33.0% -44.7%
f3 54,113 312.4% 147.8% 80.7% 51.9% 49.2% -- -17.5%
f1 65,570 399.7% 200.3% 119.0% 84.0% 80.8% 21.2% --
1000 rand ints between (0, 5):
=====
rate/sec f5 f7 f6 f1 f2 f3 f4
f5 2,787 -- -37.7% -38.4% -53.3% -59.9% -60.4% -67.0%
f7 4,477 60.6% -- -1.1% -25.0% -35.6% -36.3% -47.0%
f6 4,524 62.3% 1.1% -- -24.2% -34.9% -35.6% -46.5%
f1 5,972 114.3% 33.4% 32.0% -- -14.1% -15.0% -29.3%
f2 6,953 149.5% 55.3% 53.7% 16.4% -- -1.1% -17.7%
f3 7,030 152.2% 57.0% 55.4% 17.7% 1.1% -- -16.8%
f4 8,452 203.3% 88.8% 86.8% 41.5% 21.6% 20.2% --
1000 rand ints between (0, 50):
=====
rate/sec f5 f7 f6 f2 f1 f3 f4
f5 2,667 -- -37.8% -38.7% -53.0% -55.9% -61.1% -65.3%
f7 4,286 60.7% -- -1.5% -24.5% -29.1% -37.5% -44.2%
f6 4,351 63.1% 1.5% -- -23.4% -28.0% -36.6% -43.4%
f2 5,677 112.8% 32.4% 30.5% -- -6.1% -17.3% -26.1%
f1 6,045 126.6% 41.0% 39.0% 6.5% -- -11.9% -21.4%
f3 6,862 157.3% 60.1% 57.7% 20.9% 13.5% -- -10.7%
f4 7,687 188.2% 79.3% 76.7% 35.4% 27.2% 12.0% --
1000 rand ints between (0, 500):
=====
rate/sec f2 f5 f7 f6 f4 f3 f1
f2 2,018 -- -16.1% -44.1% -46.2% -53.4% -61.8% -63.0%
f5 2,405 19.1% -- -33.4% -35.9% -44.5% -54.4% -55.9%
f7 3,609 78.8% 50.1% -- -3.8% -16.7% -31.6% -33.8%
f6 3,753 85.9% 56.1% 4.0% -- -13.4% -28.9% -31.2%
f4 4,334 114.7% 80.2% 20.1% 15.5% -- -17.9% -20.5%
f3 5,277 161.5% 119.5% 46.2% 40.6% 21.8% -- -3.2%
f1 5,454 170.2% 126.8% 51.1% 45.3% 25.8% 3.3% --
So -- it depends.
Conclusions:
Counter
method is almost always among the slowestCounter
method is among the slowest on Python 2 but by far the fastest on Python 3.4try/except
version is usually among the slowestif key in dict
version is predictably one of the best/fastest regardless of the size or key count{}.fromkeys(tgt, 0)
is very predictable defaultdict
version is fastest on larger lists. Smaller lists the longer setup time is amortized over too few elements.