Unzip buffer with Python?

nowox · Dec 8, 2015 · Viewed 8k times

I have a buffer of bytes returned by a library call, and I would like to unzip its content, which is a single text file.

I tried with zlib, but I get this error:

>>> import zlib
>>> zlib.decompress(buffer)
error: Error -3 while decompressing data: incorrect header check

With ZipFile it works, but I have to use a temporary file:

import zipfile
f = open('foo.zip', 'wb')
f.write(buffer)
f.close()
z = zipfile.ZipFile('foo.zip')
z.extractall()
z.close()
with open('foo.txt', 'r') as f:
    uncompressed_buffer = f.read()

Is it possible to use zlib and how can I avoid using a temporary file?

Answer

Robᵩ · Dec 8, 2015

Is it possible to use zlib

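No, zlib is not designed to operate on ZIP files. A ZIP file is an archive container with its own headers and central directory around the compressed data, so zlib.decompress rejects it with the "incorrect header check" error.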
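To make the distinction concrete, here is a minimal sketch that tells a ZIP archive apart from a plain zlib stream. The function name describe_buffer is hypothetical, made up for this illustration; it relies only on zipfile.is_zipfile, which accepts a file-like object, and on zlib.decompress raising zlib.error for data it cannot read.

```python
import io
import zipfile
import zlib


def describe_buffer(data):
    """Classify a bytes buffer as 'zip', 'zlib', or 'unknown'.

    describe_buffer is a hypothetical helper for illustration only.
    """
    # is_zipfile looks for the ZIP end-of-central-directory record.
    if zipfile.is_zipfile(io.BytesIO(data)):
        return "zip"
    try:
        # Succeeds only if the data is a complete zlib-wrapped stream.
        zlib.decompress(data)
    except zlib.error:
        return "unknown"
    return "zlib"
```

A buffer that comes back as "zip" belongs in zipfile.ZipFile; only one that comes back as "zlib" is something zlib.decompress can handle directly.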

and how can I avoid using a temporary file?

Use io.BytesIO to wrap the buffer in a file-like object that ZipFile can read directly:

import zipfile
import io

buffer = b'PK\x03\x04\n\x00\x00\x00\x00\x00\n\\\x88Gzzo\xed\x03\x00\x00\x00\x03\x00\x00\x00\x07\x00\x1c\x00foo.txtUT\t\x00\x03$\x14gV(\x14gVux\x0b\x00\x01\x041\x04\x00\x00\x041\x04\x00\x00hi\nPK\x01\x02\x1e\x03\n\x00\x00\x00\x00\x00\n\\\x88Gzzo\xed\x03\x00\x00\x00\x03\x00\x00\x00\x07\x00\x18\x00\x00\x00\x00\x00\x01\x00\x00\x00\xb4\x81\x00\x00\x00\x00foo.txtUT\x05\x00\x03$\x14gVux\x0b\x00\x01\x041\x04\x00\x00\x041\x04\x00\x00PK\x05\x06\x00\x00\x00\x00\x01\x00\x01\x00M\x00\x00\x00D\x00\x00\x00\x00\x00'

z = zipfile.ZipFile(io.BytesIO(buffer))

# The following three lines are alternatives. Use one of them
# according to your need:
foo = z.read('foo.txt')        # Reads the data from "foo.txt"
foo2 = z.read(z.infolist()[0]) # Reads the data from the first file
z.extractall()                 # Copies foo.txt to the filesystem

z.close()


print(foo)
print(foo2)
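Since the question is about an archive holding a single text file, the same idea can be wrapped in a small function. This is only a sketch: unzip_single_file is a hypothetical name, and the check for exactly one member is an assumption based on the question, not something zipfile requires.

```python
import io
import zipfile


def unzip_single_file(data):
    """Return the bytes of the only member of an in-memory ZIP archive.

    unzip_single_file is a hypothetical helper for illustration only.
    """
    # The with block closes the archive even if reading fails.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = archive.namelist()
        # Assumption from the question: the archive holds one file.
        if len(names) != 1:
            raise ValueError("expected exactly one file, found %d" % len(names))
        return archive.read(names[0])
```

Called on the buffer above, unzip_single_file(buffer) should return the contents of foo.txt as bytes; call .decode() on the result if you need text. Nothing is written to disk.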