What is the current state of affairs when it comes to whether to do
Transfer-Encoding: gzip
or a
Content-Encoding: gzip
when I want to allow clients with e.g. limited bandwidth to signal their willingness to accept a compressed response and the server have the final say whether or not to compress.
The latter is what e.g. Apache's mod_deflate and IIS do, if you let it take care of compression. Depending on the size of the content to be compressed, it will do the additional Transfer-Encoding: chunked
.
It will also include a Vary: Accept-Encoding
, which already hints at the problem. Content-Encoding
seems to be part of the entity, so changing the Content-Encoding
amounts to a change of the entity, i.e. a different Accept-Encoding
header means e.g. a cache cannot use its cached version of the otherwise identical entity.
Is there a definite answer on this that I have missed (and that's not buried inside a message in a long thread in some apache newsgroup)?
My current impression is:
ETag
when it transparently compresses a response?)So I am assuming the right way would be a Transfer-Encoding: gzip
(or, if I additionally chunk the body, it would become Transfer-Encoding: gzip, chunked
). And no reason to touch Vary
or ETag
or any other header in that case as it's a transport-level thing.
For now I don't care too much about the 'hop-by-hop'-ness of Transfer-Encoding
, something that others seem to be concerned about first and foremost, because proxies might uncompress and forward uncompressed to the client. However, proxies might just as well forward it as-is (compressed), if the original request has the proper Accept-Encoding
header, which in case of all browsers that I know is a given.
Btw, this issue is at least a decade old, see e.g. https://bugzilla.mozilla.org/show_bug.cgi?id=68517 .
Any clarification on this will be appreciated. Both in terms of what is considered standards-compliant and what is considered practical. For example, HTTP client libraries only supporting transparent "Content-Encoding" would be an argument against practicality.
Quoting Roy T. Fielding, one of the authors of RFC 2616:
changing content-encoding on the fly in an inconsistent manner (neither "never" nor "always) makes it impossible for later requests regarding that content (e.g., PUT or conditional GET) to be handled correctly. This is, of course, why performing on-the-fly content-encoding is a stupid idea, and why I added Transfer-Encoding to HTTP as the proper way to do on-the-fly encoding without changing the resource.
Source: https://issues.apache.org/bugzilla/show_bug.cgi?id=39727#c31
In other words: Don't do on-the-fly Content-Encoding, use Transfer-Encoding instead!
Edit: That is, unless you want to serve gzipped content to clients that only understand Content-Encoding. Which, unfortunately, seems to be most of them. But be aware that you leave the realms of the spec and might run into issues such as the one mentioned by Fielding as well as others, e.g. when caching proxies are involved.