gzip compression of chunked encoding response?

Heinrich Schmetterling · Mar 12, 2011 · Viewed 32.3k times

I'm trying to get my web server to correctly gzip an HTTP response that uses chunked transfer encoding.

My understanding of the non-gzipped response is that it looks like this:

<the response headers>

and then for each chunk,

<chunk length in hex>\r\n<chunk>\r\n

and finally, a zero length chunk:

0\r\n\r\n
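To make the framing above concrete, here is a minimal Python sketch of it. The names `encode_chunk`, `LAST_CHUNK`, and `encode_chunked_body` are made up for illustration and are not part of any server API; the sketch only builds the body, not the headers.

```python
def encode_chunk(data: bytes) -> bytes:
    """Frame one chunk: length in hex, CRLF, the chunk bytes, CRLF."""
    return b"%x\r\n%s\r\n" % (len(data), data)


# The zero-length chunk that terminates the body (no trailers).
LAST_CHUNK = b"0\r\n\r\n"


def encode_chunked_body(pieces) -> bytes:
    """Frame every non-empty piece, then append the terminating chunk.

    Empty pieces are skipped because a zero-length chunk would be read
    as the end of the body.
    """
    return b"".join(encode_chunk(p) for p in pieces if p) + LAST_CHUNK
```

For example, `encode_chunked_body([b"hello", b"world!"])` produces `5\r\nhello\r\n6\r\nworld!\r\n0\r\n\r\n`.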

I've tried to get gzip compression working and I could use some help figuring out what should actually be returned. This documentation implies that the entire response should be gzipped, as opposed to gzipping each chunk:

HTTP servers sometimes use compression (gzip) or deflate methods to optimize transmission.
Chunked transfer encoding can be used to delimit parts of the compressed object.
In this case the chunks are not individually compressed. Instead, the complete payload 
is compressed and the output of the compression process is chunk encoded.

I tried gzipping the entire body and returning the response without chunked encoding, and even that didn't work. I also tried setting the Content-Encoding header to "gzip". Can someone explain what changes must be made to the above scheme to support gzipping a chunked response? Thanks.

Answer

sosiouxme · Feb 2, 2012

In case the other answers weren't clear enough:

First you gzip the body with zlib (this can be done in a stream so you don't need the whole thing in memory at once, which is the whole point of chunking).

Then you send that compressed body in chunks (presumably the pieces the gzip stream hands you, each preceded by a chunk header declaring its length), with the Content-Encoding: gzip and Transfer-Encoding: chunked headers and no Content-Length header.
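Here is a rough Python sketch of those two steps, in case it helps. It assumes Python's standard `zlib` module; `gzip_chunked` and `RESPONSE_HEADERS` are names I made up for illustration, and the `Content-Type` value is just a placeholder. One detail worth noting: `wbits=31` asks zlib for gzip-format output, because by default zlib emits the zlib format, which is not what `Content-Encoding: gzip` means.

```python
import zlib

# Illustrative headers only: gzip content coding, chunked transfer coding,
# and no Content-Length. The Content-Type is a placeholder assumption.
RESPONSE_HEADERS = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Encoding: gzip\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
)


def gzip_chunked(pieces):
    """Compress the pieces as ONE gzip stream and yield it chunk by chunk."""
    # wbits=31 (16 + 15) selects the gzip header and trailer.
    compressor = zlib.compressobj(wbits=31)
    for piece in pieces:
        out = compressor.compress(piece)
        # The compressor may buffer input and return nothing yet; never
        # emit an empty chunk, since that would end the body early.
        if out:
            yield b"%x\r\n%s\r\n" % (len(out), out)
    # flush() emits whatever is still buffered plus the gzip trailer.
    out = compressor.flush()
    if out:
        yield b"%x\r\n%s\r\n" % (len(out), out)
    # Terminating zero-length chunk.
    yield b"0\r\n\r\n"
```

A server would write `RESPONSE_HEADERS` and then each value yielded by `gzip_chunked(...)` to the socket as it becomes available. The chunk boundaries follow whatever the compressor happens to emit, not the boundaries of the input pieces.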

If you're using gzip, zcat, or some similar utility for the compression, it probably won't work; it needs to be zlib. If you're creating the chunks and then compressing them, that definitely won't work. If you think you're doing this right and it's still not working, try taking a packet trace and asking questions based on that and any error messages you're getting.
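If you do capture a trace, one way to sanity-check the body offline is to undo the two layers in reverse order: strip the chunk framing first, then gunzip the result. The sketch below assumes you have saved the raw response body (everything after the blank line that ends the headers) as bytes. `dechunk` and `decode_gzip_chunked` are hypothetical helper names, and trailers after the last chunk are ignored.

```python
import zlib


def dechunk(body: bytes) -> bytes:
    """Strip chunked framing and return the concatenated chunk payloads."""
    payload = bytearray()
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)
        # The size line is hex; drop any chunk extensions after ';'.
        size = int(body[pos:eol].split(b";")[0], 16)
        if size == 0:
            return bytes(payload)
        start = eol + 2
        end = start + size
        if body[end:end + 2] != b"\r\n":
            raise ValueError("chunk at offset %d is not followed by CRLF" % pos)
        payload += body[start:end]
        pos = end + 2


def decode_gzip_chunked(body: bytes) -> bytes:
    """Undo the chunked framing, then gunzip the reassembled stream."""
    raw = dechunk(body)
    # A gzip stream always starts with the bytes 0x1f 0x8b.
    if raw[:2] != b"\x1f\x8b":
        raise ValueError("payload does not start with the gzip magic bytes")
    # wbits=31 tells zlib to expect a gzip header and trailer.
    return zlib.decompress(raw, 31)
```

If `dechunk` fails, the chunk lengths or CRLFs are off; if the magic-byte check fails, the compressed data is probably not in gzip format (for example, zlib-format output sent under `Content-Encoding: gzip`).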