Is there a way to avoid this memory error?

Jesse Neff picture Jesse Neff · Apr 18, 2013 · Viewed 13.3k times · Source

I'm currently working through the problems on Project Euler, and so far I've come up with this code for a problem.

from itertools import combinations
import time

def findanums(n):
    l = []
    for i in range(1, n + 1):
        s = []
        for j in range(1, i):
            if i % j == 0:
                s.append(j)
        if sum(s) > i:
            l.append(i)
    return l

start = time.time() #start time

limit = 28123

anums = findanums(limit + 1) #abundant numbers (1..limit)
print "done finding abundants", time.time() - start

pairs = combinations(anums, 2)
print "done finding combinations", time.time() - start

sums = map(lambda x: x[0]+x[1], pairs)
print "done finding all possible sums", time.time() - start

print "start main loop"
answer = 0
for i in range(1,limit+1):
    if i not in sums:
        answer += i
print "ANSWER:",answer

When I run this I run into a MemoryError.

The traceback:

File "test.py", line 20, in <module>
    sums = map(lambda x: x[0]+x[1], pairs)

I've tried to prevent it by disabling garbage collection from what I've been able to get from Google but to no avail. Am I approaching this the wrong way? In my head this feels like the most natural way to do it and I'm at a loss at this point.

SIDE NOTE: I'm using PyPy 2.0 Beta2(Python 2.7.4) because it is so much faster than any other python implementation I've used, and Scipy/Numpy are over my head as I'm still just beginning to program and I don't have an engineering or strong math background.

Answer

Ofiris picture Ofiris · Apr 18, 2013

As Kevin mention in the comments, your algorithm might be wrong, but anyway your code is not optimized.

When using very big lists, it is common to use generators, there is a famous, great Stackoverflow answer explaining the concepts of yield and generator - What does the "yield" keyword do in Python?

The problem is that your pairs = combinations(anums, 2) doesn't generate a generator, but generates a large object which is much more memory consuming.

I changed your code to have this function, since you iterating over the collection only once, you can lazy evaluate the values:

def generator_sol(anums1, s):
      for comb in itertools.combinations(anums1, s):
        yield comb

Now instead of pairs = combinations(anums, 2) which generates a large object. Use:

pairs = generator_sol(anums, 2)

Then, instead of using the lambda I would use another generator:

sums_sol = (x[0]+x[1] for x in pairs)

Another tip, you better look at xrange which is more suitable, it doens't generate a list but an xrange object which is more efficient in many cases (such as here).