Python 3 MemoryError when producing a large list

Question 1

python memory mandelbrot

轩字语 · Aug 17, 2015 · Viewed 7.2k times · Source

Answer

You say you are creating a list with 1.6 billion elements. Each element is a complex number holding two floats. A Python complex object takes 24 bytes (at least on my system: sys.getsizeof(complex(1.0,1.0)) gives 24), so you need over 38 GB just to store the values, and that's before you even start looking at the list itself.

Your list with 1.6 billion elements won't fit at all on a 32-bit system (6.4 GB of 4-byte pointers alone, more than a 32-bit address space can cover), so you need to go to a 64-bit system with 8-byte pointers, which will need 12.8 GB just for the pointers.
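The arithmetic above can be reproduced with a few lines of Python. This is only a rough sketch: `estimate_list_bytes` is a hypothetical helper, the 24-byte object size is the figure reported in this answer (it differs between builds, so the script also prints the value measured locally), and allocator overhead is ignored.

```python
import sys

def estimate_list_bytes(n_elements, bytes_per_object, pointer_size):
    """Rough lower bound: (bytes for the objects, bytes for the list's pointers)."""
    objects = n_elements * bytes_per_object
    pointers = n_elements * pointer_size
    return objects, pointers

# Element count produced by gen_num_set(10000): four quadrants of 20000 x 20000.
n = 20000 * 20000 * 4

# 24 bytes per complex is the value quoted in the answer (an assumption here).
objects, pointers_64 = estimate_list_bytes(n, 24, 8)
_, pointers_32 = estimate_list_bytes(n, 24, 4)

print("elements:", n)
print("complex objects: %.1f GB" % (objects / 1e9))
print("list pointers, 64-bit: %.1f GB" % (pointers_64 / 1e9))
print("list pointers, 32-bit: %.1f GB" % (pointers_32 / 1e9))
print("complex size on this interpreter:", sys.getsizeof(complex(1.0, 1.0)))
```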

So there's no way you're going to do that unless you upgrade to a 64-bit OS with maybe 64 GB of RAM (though it might need more).
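One way to sidestep the storage problem, rather than buying more RAM, is to never build the full list: generate each point, test it immediately, and keep only the points that pass. The sketch below is an illustration, not the asker's code. `grid_points` and `in_mandelbrot` are hypothetical names, the grid layout is an assumption, and a small `n` is used so the demo finishes quickly. Run time still grows with the number of points and iterations, and the list of members can itself become large on a fine grid.

```python
def grid_points(n, extent=2.0):
    """Yield grid points one at a time instead of building a list.

    Hypothetical helper: covers [-extent, extent] on both axes with
    2*n + 1 samples per axis.
    """
    step = extent / n
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            yield complex(i * step, j * step)

def in_mandelbrot(c, max_iter=100):
    """Return True if z -> z*z + c stays within |z| <= 2 for max_iter steps."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return False
    return True

# Only points that pass the test are stored; rejected points are
# discarded as soon as they are generated.
members = [c for c in grid_points(20) if in_mandelbrot(c)]
print(len(members), "points kept out of", 41 * 41)
```

With this structure, peak memory depends on how many points are kept rather than on how many are examined.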

Question 2

I'm a beginner. I recently saw the Mandelbrot set, which is fantastic, so I decided to draw it with Python. But there is a problem: I get a MemoryError when I run this code.

The statement num_set = gen_num_set(10000) produces a very large list, about 20000*20000*4 = 1600000000 elements. When I use 1000 instead of 10000, the code runs successfully.

My computer has 4 GB of memory and the operating system is 32-bit Windows 7. I want to know whether this problem is a limit of my computer or whether there is some way to optimize my code.

Thanks.

#!/usr/bin/env python3.4

import matplotlib.pyplot as plt
import numpy as np
import random
import time
from multiprocessing import Pool

def first_quadrant(n):
    start_point = 1 / n
    n = 2*n
    return gen_complex_num(start_point,n,1)        

def second_quadrant(n):
    start_point = 1 / n
    n = 2*n
    return gen_complex_num(start_point,n,2)

def third_quadrant(n):
    start_point = 1 / n
    n = 2*n
    return gen_complex_num(start_point,n,3)

def four_quadrant(n):
    start_point = 1 / n
    n = 2*n
    return gen_complex_num(start_point,n,4)

def gen_complex_num(start_point,n,quadrant):
    complex_num = []
    if quadrant == 1:        
        for i in range(n):
            real = i*start_point
            for j in range(n):
                imag = j*start_point
                complex_num.append(complex(real,imag))
        return complex_num
    elif quadrant == 2:
        for i in range(n):
            real = i*start_point*(-1)
            for j in range(n):
                imag = j*start_point
                complex_num.append(complex(real,imag))
        return complex_num
    elif quadrant == 3:
        for i in range(n):
            real = i*start_point*(-1)
            for j in range(n):
                imag = j*start_point*(-1)
                complex_num.append(complex(real,imag))
        return complex_num
    elif quadrant == 4:
        for i in range(n):
            real = i*start_point
            for j in range(n):
                imag = j*start_point*(-1)
                complex_num.append(complex(real,imag))
        return complex_num            

def gen_num_set(n):
    return [first_quadrant(n), second_quadrant(n), third_quadrant(n), four_quadrant(n)]

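def if_man_set(num_set):
    # Return the points of num_set whose orbit stays within |z| <= 2
    # for iteration_n steps.
    iteration_n = 10000
    man_set = []
    for c in num_set:
        # Restart the orbit at 0 for every candidate point c; otherwise z
        # carries over from the previous point whenever that point was kept.
        z = complex(0,0)
        if_man = 1
        for i in range(iteration_n):
            if abs(z) > 2:
                if_man = 0
                break
            z = z*z + c
        if if_man:
            man_set.append(c)
    return man_set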


def plot_scatter(x,y):
    #plt.plot(x,y)

    color = ran_color()
    plt.scatter(x,y,c=color)
    plt.show()

def ran_num():
    return random.random()

def ran_color():
    return [ran_num() for i in range(3)]

def plot_man_set(man_set):
    z_real = []
    z_imag = []
    for z in man_set:
        z_real.append(z.real)
        z_imag.append(z.imag)
    plot_scatter(z_real,z_imag)


if __name__ == "__main__":
    start_time = time.time()
    num_set = gen_num_set(10000)    
    with Pool(processes=4) as pool:
        #use multiprocess
        set_part = pool.map(if_man_set, num_set)
    man_set = []
    for i in set_part:
        man_set += i
    plot_man_set(man_set)
    end_time = time.time()
    use_time = end_time - start_time
    print(use_time)
