I was trying to implement a Miller-Rabin primality test, and was puzzled why it was taking so long (> 20 seconds) for midsize numbers (~7 digits). I eventually found the following line of code to be the source of the problem:
x = a**d % n
(where a
, d
, and n
are all similar, but unequal, midsize numbers, **
is the exponentiation operator, and %
is the modulo operator)
I then I tried replacing it with the following:
x = pow(a, d, n)
and it by comparison it is almost instantaneous.
For context, here is the original function:
from random import randint
def primalityTest(n, k):
if n < 2:
return False
if n % 2 == 0:
return False
s = 0
d = n - 1
while d % 2 == 0:
s += 1
d >>= 1
for i in range(k):
rand = randint(2, n - 2)
x = rand**d % n # offending line
if x == 1 or x == n - 1:
continue
for r in range(s):
toReturn = True
x = pow(x, 2, n)
if x == 1:
return False
if x == n - 1:
toReturn = False
break
if toReturn:
return False
return True
print(primalityTest(2700643,1))
An example timed calculation:
from timeit import timeit
a = 2505626
d = 1520321
n = 2700643
def testA():
print(a**d % n)
def testB():
print(pow(a, d, n))
print("time: %(time)fs" % {"time":timeit("testA()", setup="from __main__ import testA", number=1)})
print("time: %(time)fs" % {"time":timeit("testB()", setup="from __main__ import testB", number=1)})
Output (run with PyPy 1.9.0):
2642565
time: 23.785543s
2642565
time: 0.000030s
Output (run with Python 3.3.0, 2.7.2 returns very similar times):
2642565
time: 14.426975s
2642565
time: 0.000021s
And a related question, why is this calculation almost twice as fast when run with Python 2 or 3 than with PyPy, when usually PyPy is much faster?
See the Wikipedia article on modular exponentiation. Basically, when you do a**d % n
, you actually have to calculate a**d
, which could be quite large. But there are ways of computing a**d % n
without having to compute a**d
itself, and that is what pow
does. The **
operator can't do this because it can't "see into the future" to know that you are going to immediately take the modulus.