Transfer-Encoding: gzip vs. Content-Encoding: gzip

Evgeniy Berezovsky · Jul 25, 2012 · Viewed 81.3k times

What is the current state of affairs when it comes to whether to use

Transfer-Encoding: gzip

or

Content-Encoding: gzip

when I want to allow clients with e.g. limited bandwidth to signal their willingness to accept a compressed response, while the server has the final say on whether or not to compress?

The latter is what e.g. Apache's mod_deflate and IIS do, if you let them take care of compression. Depending on the size of the content to be compressed, they will additionally apply Transfer-Encoding: chunked.

They will also include a Vary: Accept-Encoding header, which already hints at the problem. Content-Encoding seems to be part of the entity, so changing the Content-Encoding amounts to changing the entity, i.e. a request with a different Accept-Encoding header means that e.g. a cache cannot use its cached version of the otherwise identical entity.
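To make the caching consequence concrete, here is a minimal sketch of how a cache might build its lookup key when a response carries Vary. The function name cache_key and the key layout are assumptions made for illustration, not how any particular cache implements it.

```python
def cache_key(url, request_headers, vary):
    """Build a lookup key from the URL plus the request headers named in Vary.

    Hypothetical helper: real caches also normalize header values further.
    """
    normalized = {k.lower(): v.strip() for k, v in request_headers.items()}
    selected = tuple(
        (name.lower(), normalized.get(name.lower(), "")) for name in vary
    )
    return (url, selected)


vary = ["Accept-Encoding"]

# Same URL, but one client advertises gzip and the other does not.
key_gzip = cache_key("/index.html", {"Accept-Encoding": "gzip"}, vary)
key_plain = cache_key("/index.html", {}, vary)

# Without Vary, both requests would map to the same cached entry.
key_no_vary_a = cache_key("/index.html", {"Accept-Encoding": "gzip"}, [])
key_no_vary_b = cache_key("/index.html", {}, [])
```

With Vary: Accept-Encoding the two requests yield different keys, so the cache has to store (or fetch) the entity twice even though only its encoding differs.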

Is there a definite answer on this that I have missed (and that's not buried inside a message in a long thread in some apache newsgroup)?

My current impression is:

  • Transfer-Encoding would in fact be the right way to do what is mostly done with Content-Encoding by existing server and client implementations
  • Content-Encoding, because of its semantic implications, carries a couple of issues (what should the server do to the ETag when it transparently compresses a response?)
  • The reason Transfer-Encoding: gzip is not used is chicken'n'egg: Browsers don't support it because servers don't because browsers don't
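The ETag question in the second bullet can be illustrated with a small sketch. It assumes a hypothetical server that derives a strong ETag from a hash of the bytes it sends as the entity; the helper strong_etag is made up for this example.

```python
import gzip
import hashlib


def strong_etag(representation: bytes) -> str:
    """Hypothetical strong validator: a quoted, truncated SHA-256 of the bytes."""
    return '"' + hashlib.sha256(representation).hexdigest()[:16] + '"'


body = b"hello world\n" * 100

# Identity response: the entity is the uncompressed body.
identity_etag = strong_etag(body)

# On-the-fly Content-Encoding: gzip changes the entity bytes, so a
# byte-derived strong ETag changes with it. mtime=0 keeps output stable.
gzip_body = gzip.compress(body, mtime=0)
gzip_etag = strong_etag(gzip_body)

# With Transfer-Encoding: gzip the entity is still `body`; only the
# message framing on the wire differs, so the ETag would stay the same.
transfer_encoded_etag = strong_etag(body)
```

Under this assumption the server ends up with two validators for what the author considers one resource, which is what makes later conditional requests awkward to match.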

So I am assuming the right way would be a Transfer-Encoding: gzip (or, if I additionally chunk the body, it would become Transfer-Encoding: gzip, chunked). And no reason to touch Vary or ETag or any other header in that case as it's a transport-level thing.
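The order in Transfer-Encoding: gzip, chunked matters: the codings are listed in the order they were applied, so the body is gzipped first and then chunked, and the receiver undoes them in reverse. A minimal sketch follows, with hypothetical helper names and a deliberately simplified chunk parser (no trailers, no error handling):

```python
import gzip


def chunk(data: bytes, size: int = 1024) -> bytes:
    """Frame data with the chunked transfer coding (no trailers)."""
    out = bytearray()
    for i in range(0, len(data), size):
        piece = data[i:i + size]
        out += f"{len(piece):x}\r\n".encode("ascii") + piece + b"\r\n"
    out += b"0\r\n\r\n"  # last-chunk marker followed by an empty trailer
    return bytes(out)


def dechunk(framed: bytes) -> bytes:
    """Reverse chunk(); chunk extensions after ';' are ignored."""
    out = bytearray()
    pos = 0
    while True:
        eol = framed.index(b"\r\n", pos)
        size = int(framed[pos:eol].split(b";")[0], 16)
        if size == 0:
            break
        start = eol + 2
        out += framed[start:start + size]
        pos = start + size + 2  # skip the CRLF that ends the chunk data
    return bytes(out)


def encode_gzip_chunked(body: bytes) -> bytes:
    """Sender side: gzip first, then chunk."""
    return chunk(gzip.compress(body))


def decode_gzip_chunked(wire: bytes) -> bytes:
    """Receiver side: remove chunking first, then gunzip."""
    return gzip.decompress(dechunk(wire))


original = b"some response body " * 200
wire = encode_gzip_chunked(original)
restored = decode_gzip_chunked(wire)
```

After decoding, the receiver holds exactly the original entity bytes, which is why headers such as ETag would not need to change in this scheme.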

For now I don't care too much about the 'hop-by-hop'-ness of Transfer-Encoding, something that others seem to be concerned about first and foremost, because proxies might uncompress the response and forward it uncompressed to the client. However, proxies might just as well forward it as-is (compressed) if the original request has the proper Accept-Encoding header, which in the case of all browsers that I know is a given.

Btw, this issue is at least a decade old, see e.g. https://bugzilla.mozilla.org/show_bug.cgi?id=68517 .

Any clarification on this will be appreciated. Both in terms of what is considered standards-compliant and what is considered practical. For example, HTTP client libraries only supporting transparent "Content-Encoding" would be an argument against practicality.

Answer

Evgeniy Berezovsky · Jul 26, 2012

Quoting Roy T. Fielding, one of the authors of RFC 2616:

changing content-encoding on the fly in an inconsistent manner (neither "never" nor "always) makes it impossible for later requests regarding that content (e.g., PUT or conditional GET) to be handled correctly. This is, of course, why performing on-the-fly content-encoding is a stupid idea, and why I added Transfer-Encoding to HTTP as the proper way to do on-the-fly encoding without changing the resource.

Source: https://issues.apache.org/bugzilla/show_bug.cgi?id=39727#c31

In other words: Don't do on-the-fly Content-Encoding, use Transfer-Encoding instead!

Edit: That is, unless you want to serve gzipped content to clients that only understand Content-Encoding. Which, unfortunately, seems to be most of them. But be aware that you leave the realms of the spec and might run into issues such as the one mentioned by Fielding as well as others, e.g. when caching proxies are involved.
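As a rough sketch of that trade-off, a server could prefer Transfer-Encoding when the client advertises it and fall back to Content-Encoding otherwise. HTTP/1.1 defines the TE request header for advertising acceptable transfer codings; the function names and the exact fallback policy below are assumptions for illustration, not what mod_deflate or IIS actually do.

```python
def tokens(header_value: str) -> set:
    """Lowercase coding names from a comma-separated header, dropping q=0 entries."""
    result = set()
    for item in header_value.split(","):
        parts = [p.strip() for p in item.split(";")]
        name = parts[0].lower()
        if not name:
            continue
        q = 1.0
        for p in parts[1:]:
            if p.lower().startswith("q="):
                try:
                    q = float(p[2:])
                except ValueError:
                    q = 0.0  # treat an unparsable weight as "not acceptable"
        if q > 0:
            result.add(name)
    return result


def choose_compression(request_headers: dict) -> dict:
    """Return the response headers a hypothetical server would add."""
    headers = {k.lower(): v for k, v in request_headers.items()}
    if "gzip" in tokens(headers.get("te", "")):
        # Transport-level compression: the entity, ETag and Vary stay untouched.
        return {"Transfer-Encoding": "gzip, chunked"}
    if "gzip" in tokens(headers.get("accept-encoding", "")):
        # Fallback for clients that only understand Content-Encoding.
        return {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
    # Uncompressed, but still negotiated on Accept-Encoding, so Vary is kept.
    return {"Vary": "Accept-Encoding"}
```

For example, choose_compression({"TE": "gzip"}) selects the Transfer-Encoding path, while a typical browser request carrying only Accept-Encoding: gzip, deflate ends up on the Content-Encoding fallback.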