A Fast Prime Number Sieve in Python

cobie picture cobie · Apr 14, 2013 · Viewed 17.8k times · Source

I have been going through prime number generation in python using the sieve of Eratosthenes and the solutions which people tout as a relatively fast option such as those in a few of the answers to a question on optimising prime number generation in python are not straightforward and the simple implementation which I have here rivals them in efficiency. My implementation is given below

def sieve_for_primes_to(n):
    size = n//2
    sieve = [1]*size
    limit = int(n**0.5)
    for i in range(1,limit):
        if sieve[i]:
            val = 2*i+1
            tmp = ((size-1) - i)//val 
            sieve[i+val::val] = [0]*tmp
    return sieve


print [2] + [i*2+1 for i, v in enumerate(sieve_for_primes_to(10000000)) if v and i>0]

Timing the execution returns

python -m timeit -n10 -s "import euler" "euler.sieve_for_primes_to(1000000)"
10 loops, best of 3: 19.5 msec per loop

While the method described in the answer to the above linked question as being the fastest from the python cookbook is given below

import itertools
def erat2( ):
    D = {  }
    yield 2
    for q in itertools.islice(itertools.count(3), 0, None, 2):
        p = D.pop(q, None)
        if p is None:
            D[q*q] = q
            yield q
        else:
            x = p + q
            while x in D or not (x&1):
                x += p
            D[x] = p

def get_primes_erat(n):
  return list(itertools.takewhile(lambda p: p<n, erat2()))

When run it gives

python -m timeit -n10 -s "import euler" "euler.get_primes_erat(1000000)"
10 loops, best of 3: 697 msec per loop

My question is why do people tout the above from the cook book which is relatively complex as the ideal prime generator?

Answer

ABri picture ABri · Sep 25, 2013

I transformed your code to fit into the prime sieve comparison script of @unutbu at Fastest way to list all primes below N as follows:

def sieve_for_primes_to(n):
    size = n//2
    sieve = [1]*size
    limit = int(n**0.5)
    for i in range(1,limit):
        if sieve[i]:
            val = 2*i+1
            tmp = ((size-1) - i)//val 
            sieve[i+val::val] = [0]*tmp
    return [2] + [i*2+1 for i, v in enumerate(sieve) if v and i>0]

On my MBPro i7 the script is fast calculating all primes < 1000000 but actually 1.5 times slower than rwh_primes2, rwh_primes1 (1.2), rwh_primes (1.19) and primeSieveSeq (1.12) (@andreasbriese at the page end).