Surprising results with Python timeit: Counter() vs defaultdict() vs dict()

BlueTrin picture BlueTrin · Jan 6, 2015 · Viewed 7.5k times · Source

I obtained very surprising results with timeit, can someone tell me if I am doing something wrong ? I am using Python 2.7.

This is the contents of file speedtest_init.py:

import random

to_count = [random.randint(0, 100) for r in range(60)]

These are the contents of speedtest.py:

__author__ = 'BlueTrin'

import timeit

def test_init1():
    print(timeit.timeit('import speedtest_init'))

def test_counter1():
    s = """\
    d = defaultdict(int);
    for i in speedtest_init.to_count:
        d[i] += 1
    """
    print(timeit.timeit(s, 'from collections import defaultdict; import speedtest_init;'))

def test_counter2():
    print(timeit.timeit('d = Counter(speedtest_init.to_count);', 'from collections import Counter; import speedtest_init;'))


if __name__ == "__main__":
    test_init1()
    test_counter1()
    test_counter2()

The console output is:

C:\Python27\python.exe C:/Dev/codility/chlorum2014/speedtest.py
2.71501962931
65.7090444503
91.2953839048

Process finished with exit code 0

I think by default timeit() runs 1000000 times the code, so I need to divide the times by 1000000, but what is surprising is that the Counter is slower than the defaultdict().

Is that expected ?

EDIT:

Also using a dict is faster than a defaultdict(int):

def test_counter3():
    s = """\
    d = {};
    for i in speedtest_init.to_count:
        if i not in d:
            d[i] = 1
        else:
            d[i] += 1
    """
    print(timeit.timeit(stmt=s, setup='from collections import defaultdict; import speedtest_init;')

this last version is faster than the defaultdict(int) meaning that unless you care more about readability you should use the dict() rather than the defaultdict().

Answer

Martijn Pieters picture Martijn Pieters · Jan 6, 2015

Yes, this is expected; the Counter() constructor uses Counter.update() which uses self.get() to load initial values rather than rely on __missing__.

Moreover, the defaultdict __missing__ factory is handled entirely in C code, especially when using another type like int() that is itself implemented in C. The Counter source is pure Python and as such the Counter.__missing__ method requires a Python frame to execute.

Because dict.get() is still handled in C, the constructor approach is the faster approach for a Counter(), provided you use the same trick Counter.update() uses and alias the self.get lookup as a local first:

>>> import timeit
>>> import random
>>> import sys
>>> sys.version_info
sys.version_info(major=2, minor=7, micro=9, releaselevel='final', serial=0)
>>> to_count = [random.randint(0, 100) for r in range(60)]
>>> timeit.timeit('for i in to_count: c[i] += 1',
...               'from collections import Counter; from __main__ import to_count; c = Counter()',
...               number=10000)
0.2510359287261963
>>> timeit.timeit('for i in to_count: c[i] = c_get(i, 0) + 1',
...               'from collections import Counter; from __main__ import to_count; c = Counter(); c_get = c.get',
...               number=10000)
0.20978617668151855

Both defaultdict and Counter are helpful classes built for their functionality, not their performance; not relying on the __missing__ hook can be faster still:

>>> timeit.timeit('for i in to_count: d[i] = d_get(i, 0) + 1',
...               'from __main__ import to_count; d = {}; d_get = d.get',
...               number=10000)
0.11437392234802246

That's a regular dictionary using an aliased dict.get() method for maximum speed. But then you'll also have to re-implement the bag behaviour of Counter, or the Counter.most_common() method. The defaultdict use cases go way beyond counting.

In Python 3.2, updating a Counter() got a speed boost by adding a C library that handles this case; see issue 10667. Testing on Python 3.4, the Counter() constructor now beats the aliased dict.get case:

>>> timeit.timeit('Counter(to_count)',
...               'from collections import Counter; from __main__ import to_count',
...               number=100000)
0.8332311600097455
>>> timeit.timeit('for i in to_count: d[i] = d_get(i, 0) + 1',
...               'from __main__ import to_count; d = {}; d_get = d.get',
...               number=100000)
0.961191965994658
>>> import sys
>>> sys.version_info
sys.version_info(major=3, minor=4, micro=2, releaselevel='final', serial=0)

(Note: to get a meaningful timing result the number of iterations was increased from 10k to 100k; so if you compare these against the dict.get() case above you need to take the timing there times ten, at 1.144 seconds).