Pass io.BytesIO object to gzip.GzipFile and write to GzipFile

timakro · Sep 26, 2015

I basically want to do exactly what's described in the documentation of gzip.GzipFile:

Calling a GzipFile object’s close() method does not close fileobj, since you might wish to append more material after the compressed data. This also allows you to pass a io.BytesIO object opened for writing as fileobj, and retrieve the resulting memory buffer using the io.BytesIO object’s getvalue() method.

With a normal file object it works as expected.

>>> import gzip
>>> fileobj = open("test", "wb")
>>> fileobj.writable()
True
>>> gzipfile = gzip.GzipFile(fileobj=fileobj)
>>> gzipfile.writable()
True

But I can't manage to get a writable gzip.GzipFile object when passing an io.BytesIO object.

>>> import io
>>> bytesbuffer = io.BytesIO()
>>> bytesbuffer.writable()
True
>>> gzipfile = gzip.GzipFile(fileobj=bytesbuffer)
>>> gzipfile.writable()
False

Do I have to open the io.BytesIO explicitly for writing, and if so, how? Or is there a difference between a file object returned by open(filename, "wb") and an object returned by io.BytesIO() that I didn't think of?

Answer

Martijn Pieters · Sep 26, 2015

Yes, you need to explicitly set the GzipFile mode to 'w'. Otherwise GzipFile tries to take the mode from the file object, but a BytesIO object has no .mode attribute, so GzipFile falls back to read mode:

>>> import io
>>> io.BytesIO().mode
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: '_io.BytesIO' object has no attribute 'mode'
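
The effect of the missing attribute can be checked without triggering the traceback. This is a minimal sketch, assuming Python 3; the variable names are only illustrative:

```python
import gzip
import io

buffer = io.BytesIO()

# BytesIO exposes no .mode attribute for GzipFile to inspect.
has_mode = hasattr(buffer, "mode")

# With no mode given, the resulting GzipFile is readable, not writable.
default_gz = gzip.GzipFile(fileobj=buffer)
default_readable = default_gz.readable()
default_writable = default_gz.writable()

print(has_mode, default_readable, default_writable)
```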

Just specify the mode explicitly:

gzipfile = gzip.GzipFile(fileobj=fileobj, mode='w')

Demo:

>>> import gzip
>>> gzip.GzipFile(fileobj=io.BytesIO(), mode='w').writable()
True

In principle a BytesIO object behaves as if it were opened in 'w+b' mode, but GzipFile only looks at the first character of a file mode.
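
Putting it together, here is a minimal sketch of the round trip the documentation describes: compress into an in-memory buffer, then retrieve the bytes with getvalue(). The payload and variable names are made up for illustration, and Python 3 is assumed:

```python
import gzip
import io

payload = b"hello gzip " * 100  # illustrative data, not from the question

buffer = io.BytesIO()

# mode="w" must be given explicitly, because BytesIO has no .mode attribute.
with gzip.GzipFile(fileobj=buffer, mode="w") as gz:
    gz.write(payload)

# Closing the GzipFile flushes the gzip trailer but leaves the buffer open.
still_open = not buffer.closed
compressed = buffer.getvalue()

# Decompress to confirm the buffer holds a complete gzip stream.
restored = gzip.decompress(compressed)
print(still_open, restored == payload)
```

Call getvalue() only after the GzipFile has been closed (here, after the with block ends); before that, the trailer has not been written and the stream is incomplete.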