Python generator objects: __sizeof__()

root · Sep 18, 2012

This may be a stupid question but I will ask it anyway. I have a generator object:

>>> def gen():
...     for i in range(10):
...         yield i
...         
>>> obj=gen()

I can measure its size:

>>> obj.__sizeof__()
24

It is said that generators get consumed:

>>> for i in obj:
...     print i
...     
0
1
2
3
4
5
6
7
8
9
>>> obj.__sizeof__()
24

...but obj.__sizeof__() remains the same.

With strings it works as I expected:

>>> 'longstring'.__sizeof__()
34
>>> 'str'.__sizeof__()
27

I would be thankful if someone could enlighten me.

Answer

Martijn Pieters · Sep 18, 2012

__sizeof__() does not do what you think it does. The method returns the internal size in bytes for the given object, not the number of items a generator is going to return.
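
To make the distinction concrete, here is a minimal sketch, assuming CPython (where sys.getsizeof() is supported). The helper name numbers is made up for illustration. Two generators built from the same function report the same size no matter how many items they will eventually produce, whereas a list grows with its contents:

```python
import sys

def numbers(n):
    # Hypothetical helper for illustration: lazily yields 0 .. n-1.
    for i in range(n):
        yield i

small = numbers(10)
large = numbers(10 ** 6)

# A generator object holds its paused execution state, not the items,
# so the reported size does not depend on n.
same_size = sys.getsizeof(small) == sys.getsizeof(large)

# A list stores one reference per item, so its size grows with its length.
list_grows = sys.getsizeof(list(range(10))) < sys.getsizeof(list(range(1000)))

print(same_size, list_grows)
```

The absolute numbers differ between Python versions and platforms, which is why the sketch only compares sizes with each other.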

Python cannot know in advance how many items a generator will produce. Take, for example, the following endless generator (for illustration only; there are better ways to create a counter, such as itertools.count()):

def count():
    count = 0
    while True:
        yield count
        count += 1

That generator is endless; no length can be assigned to it. Yet the generator object itself still takes up memory (the exact figure depends on the Python version and platform):

>>> count().__sizeof__()
88

You don't normally call __sizeof__() directly; you leave that to the sys.getsizeof() function, which also adds the garbage collector overhead for objects that the garbage collector manages.
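
As a rough sketch of that difference, assuming CPython, the snippet below compares the two calls on the generator from the question. The variable names raw, total and overhead are only illustrative:

```python
import sys

def gen():
    for i in range(10):
        yield i

obj = gen()

raw = obj.__sizeof__()       # size as reported by the object itself
total = sys.getsizeof(obj)   # the same, plus GC overhead if the object is GC-managed

# Whatever sys.getsizeof() added on top of __sizeof__().
overhead = total - raw

print(raw, total, overhead)
```

The concrete values are implementation details, so no output is shown here; only the relationship between the two numbers is the point.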

If you know a generator is going to be finite and you have to know how many items it returns, use:

sum(1 for item in generator)

but note that that exhausts the generator.
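
A small sketch of what that exhaustion means in practice, reusing the gen() function from the question. Materializing the items with list() is shown as one possible alternative when both the count and the values are needed, at the cost of holding everything in memory:

```python
def gen():
    for i in range(10):
        yield i

generator = gen()
n_items = sum(1 for item in generator)  # walks through every item once
leftover = list(generator)              # the generator is now exhausted

print(n_items, leftover)  # 10 []

# Alternative: keep the items so they can be counted and still used.
items = list(gen())
print(len(items), items[:3])  # 10 [0, 1, 2]
```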