This may be a stupid question, but I will ask it anyway. I have a generator object:
>>> def gen():
...     for i in range(10):
...         yield i
...
>>> obj = gen()
I can measure its size:
>>> obj.__sizeof__()
24
It is said that generators get consumed:
>>> for i in obj:
...     print i
...
0
1
2
3
4
5
6
7
8
9
>>> obj.__sizeof__()
24
...but obj.__sizeof__() remains the same.
With strings it works as I expected:
>>> 'longstring'.__sizeof__()
34
>>> 'str'.__sizeof__()
27
I would be thankful if someone could enlighten me.
__sizeof__() does not do what you think it does. The method returns the internal size in bytes of the given object, not the number of items a generator is going to return.
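You can verify that a generator's size is independent of the number of items it will produce; here is a quick check (written as comparisons rather than raw byte counts, since the exact numbers vary across platforms and Python versions):
>>> import sys
>>> small = (i for i in range(10))       # will yield 10 items
>>> big = (i for i in range(10000))      # will yield 10,000 items
>>> sys.getsizeof(small) == sys.getsizeof(big)
True
>>> sys.getsizeof(list(range(10))) < sys.getsizeof(list(range(10000)))
True
A list (or a string, which stores its characters inline) does grow with its contents, which is why the string comparison behaved the way you expected.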
Python cannot know beforehand how many items a generator will produce. Take, for example, the following endless generator (an illustration only; there are better ways to create a counter):
def count():
    count = 0
    while True:
        yield count
        count += 1
That generator is endless; there is no size that could be assigned to its output. Yet the generator object itself takes memory:
>>> count().__sizeof__()
88
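If you ever need values from an endless generator like this one, itertools.islice lets you pull a bounded number of items without looping forever:
>>> from itertools import islice
>>> list(islice(count(), 5))   # take just the first five values
[0, 1, 2, 3, 4]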
You don't normally call __sizeof__() directly; you leave that to the sys.getsizeof() function, which also adds garbage collector overhead.
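The difference is easy to observe; for any object the garbage collector tracks, sys.getsizeof() reports at least as many bytes as __sizeof__() (the exact overhead varies per build):
>>> import sys
>>> obj = gen()
>>> sys.getsizeof(obj) >= obj.__sizeof__()   # getsizeof adds the GC header on top
True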
If you know a generator is going to be finite and you need to know how many items it yields, use:
sum(1 for item in generator)
but note that doing so exhausts the generator.
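For example, with the gen() generator from your question:
>>> obj = gen()
>>> sum(1 for item in obj)
10
>>> list(obj)   # the generator is now exhausted
[]
If you need both the count and the items, materialize the generator into a list first and take len() of that.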