A coworker recently wrote a program in which he used a Python list as a queue. In other words, he used .append(x)
when needing to insert items and .pop(0)
when needing to remove items.
I know that Python has collections.deque
and I'm trying to figure out whether to spend my (limited) time to rewrite this code to use it. Assuming that we perform millions of appends and pops but never have more than a few thousand entries, will his list usage be a problem?
In particular, will the underlying array used by the Python list implementation continue to grow indefinitely have millions of spots even though the list only has a thousand things, or will Python eventually do a realloc
and free up some of that memory?
Some answers claimed a "10x" speed advantage for deque vs list-used-as-FIFO when both have 1000 entries, but that's a bit of an overbid:
$ python -mtimeit -s'q=range(1000)' 'q.append(23); q.pop(0)'
1000000 loops, best of 3: 1.24 usec per loop
$ python -mtimeit -s'import collections; q=collections.deque(range(1000))' 'q.append(23); q.popleft()'
1000000 loops, best of 3: 0.573 usec per loop
python -mtimeit
is your friend -- a really useful and simple micro-benchmarking approach! With it you can of course also trivially explore performance in much-smaller cases:
$ python -mtimeit -s'q=range(100)' 'q.append(23); q.pop(0)'
1000000 loops, best of 3: 0.972 usec per loop
$ python -mtimeit -s'import collections; q=collections.deque(range(100))' 'q.append(23); q.popleft()'
1000000 loops, best of 3: 0.576 usec per loop
(not very different for 12 instead of 100 items btw), and in much-larger ones:
$ python -mtimeit -s'q=range(10000)' 'q.append(23); q.pop(0)'
100000 loops, best of 3: 5.81 usec per loop
$ python -mtimeit -s'import collections; q=collections.deque(range(10000))' 'q.append(23); q.popleft()'
1000000 loops, best of 3: 0.574 usec per loop
You can see that the claim of O(1) performance for deque is well founded, while a list is over twice as slow around 1,000 items, an order of magnitude around 10,000. You can also see that even in such cases you're only wasting 5 microseconds or so per append/pop pair and decide how significant that wastage is (though if that's all you're doing with that container, deque has no downside, so you might as well switch even if 5 usec more or less won't make an important difference).