I've been trying to understand why Python 3 is actually taking much time compared with Python 2 in certain situations, below are few cases I've verified from python 3.4 to python 2.7.
Note: I've gone through some of the questions like Why is there no xrange function in Python3? and loop in python3 much slower than python2 and Same code slower in Python3 as compared to Python2, but I feel that I didn't get the actual reason behind this issue.
I've tried this piece of code to show how it is making difference:
MAX_NUM = 3*10**7
# This is to make compatible with py3.4.
try:
xrange
except:
xrange = range
def foo():
i = MAX_NUM
while i> 0:
i -= 1
def foo_for():
for i in xrange(MAX_NUM):
pass
When I've tried running this programme with py3.4 and py2.7 I've got below results.
Note: These stats came through a 64 bit
machine with 2.6Ghz
processor and calculated the time using time.time()
in single loop.
Output : Python 3.4
-----------------
2.6392083168029785
0.9724123477935791
Output: Python 2.7
------------------
1.5131521225
0.475143909454
I really don't think that there has been changes applied to while
or xrange
from 2.7 to 3.4, I know range
has been started acting as to xrange
in py3.4 but as documentation says
range()
now behaves likexrange()
used to behave, except it works with values of arbitrary size. The latter no longer exists.
this means change from xrange
to range
is very much equal to a name change but working with arbitrary values.
I've verified disassembled byte code as well.
Below is the disassembled byte code for function foo()
:
Python 3.4:
---------------
13 0 LOAD_GLOBAL 0 (MAX_NUM)
3 STORE_FAST 0 (i)
14 6 SETUP_LOOP 26 (to 35)
>> 9 LOAD_FAST 0 (i)
12 LOAD_CONST 1 (0)
15 COMPARE_OP 4 (>)
18 POP_JUMP_IF_FALSE 34
15 21 LOAD_FAST 0 (i)
24 LOAD_CONST 2 (1)
27 INPLACE_SUBTRACT
28 STORE_FAST 0 (i)
31 JUMP_ABSOLUTE 9
>> 34 POP_BLOCK
>> 35 LOAD_CONST 0 (None)
38 RETURN_VALUE
python 2.7
-------------
13 0 LOAD_GLOBAL 0 (MAX_NUM)
3 STORE_FAST 0 (i)
14 6 SETUP_LOOP 26 (to 35)
>> 9 LOAD_FAST 0 (i)
12 LOAD_CONST 1 (0)
15 COMPARE_OP 4 (>)
18 POP_JUMP_IF_FALSE 34
15 21 LOAD_FAST 0 (i)
24 LOAD_CONST 2 (1)
27 INPLACE_SUBTRACT
28 STORE_FAST 0 (i)
31 JUMP_ABSOLUTE 9
>> 34 POP_BLOCK
>> 35 LOAD_CONST 0 (None)
38 RETURN_VALUE
And below is the disassembled byte code for function foo_for()
:
Python: 3.4
19 0 SETUP_LOOP 20 (to 23)
3 LOAD_GLOBAL 0 (xrange)
6 LOAD_GLOBAL 1 (MAX_NUM)
9 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
12 GET_ITER
>> 13 FOR_ITER 6 (to 22)
16 STORE_FAST 0 (i)
20 19 JUMP_ABSOLUTE 13
>> 22 POP_BLOCK
>> 23 LOAD_CONST 0 (None)
26 RETURN_VALUE
Python: 2.7
-------------
19 0 SETUP_LOOP 20 (to 23)
3 LOAD_GLOBAL 0 (xrange)
6 LOAD_GLOBAL 1 (MAX_NUM)
9 CALL_FUNCTION 1
12 GET_ITER
>> 13 FOR_ITER 6 (to 22)
16 STORE_FAST 0 (i)
20 19 JUMP_ABSOLUTE 13
>> 22 POP_BLOCK
>> 23 LOAD_CONST 0 (None)
26 RETURN_VALUE
If we compare both the byte codes they've produced the same disassembled byte code.
Now I'm wondering what change from 2.7 to 3.4 is really causing this huge change in execution time in the given piece of code.
The difference is in the implementation of the int
type. Python 3.x uses the arbitrary-sized integer type (long
in 2.x) exclusively, while in Python 2.x for values up to sys.maxint
a simpler int
type is used that uses a simple C long
under the hood.
Once you limit your loops to long
integers, Python 3.x is faster:
>>> from timeit import timeit
>>> MAX_NUM = 3*10**3
>>> def bar():
... i = MAX_NUM + sys.maxsize
... while i > sys.maxsize:
... i -= 1
...
Python 2:
>>> timeit(bar, number=10000)
5.704327821731567
Python 3:
>>> timeit(bar, number=10000)
3.7299320790334605
I used sys.maxsize
as sys.maxint
was dropped from Python 3, but the integer value is basically the same.
The speed difference in Python 2 is thus limited to the first (2 ** 63) - 1 integers on 64-bit, (2 ** 31) - 1 integers on 32 bit systems.
Since you cannot use the long
type with xrange()
on Python 2, I did not include a comparison for that function.