How to specify acceptance of multipart/related content with particular body-part content types in the Accept header field

Gunter Zeilinger · Mar 31, 2016 · Viewed 10.7k times

RFC 7231 (HTTP/1.1 Semantics and Content), section 5.3 (Content Negotiation), does not define how to express acceptance of multipart/related content with particular content types for its body parts in the Accept header field.

For instance, how do I express acceptance of multipart/related content with text/html body parts?

Accept: multipart/related;type=text/html

or

Accept: multipart/related,text/html

And what if I want to specify precedence for different HTML flavours?

Accept: multipart/related;type=text/html;q=0.7,
   multipart/related;type=text/html;level=1,
   multipart/related;type=text/html;level=2;q=0.4

or

Accept: multipart/related,text/html;q=0.7,
   text/html;level=1,
   text/html;level=2;q=0.4

What's right? Both?

Answer

DaSourcerer · Mar 31, 2016

To start off, HTTP is a MIME-like protocol, not a MIME-compliant one. To quote RFC 7230, section 2.1:

Messages are passed in a format similar to that used by Internet mail [RFC5322] and the Multipurpose Internet Mail Extensions (MIME) [RFC2045] (see Appendix A of [RFC7231] for the differences between HTTP and MIME messages).

This is important to keep in mind, as this grants us some liberties when dealing with MIME content.

The Accept header is subject to RFC 7231, sec. 5.3.2. The syntax described there allows for a comma-separated list of mediatypes (see RFC 7230, sec. 7), each with an arbitrary number of mediatype-specific parameters in addition to the HTTP-specific weight parameter q (see RFC 7231, sec. 5.3.1).
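To make that syntax concrete, here is a minimal sketch of how such a header value could be taken apart. The function name parse_accept is hypothetical, and the sketch assumes that no quoted parameter value contains a comma or a semicolon; it also treats everything other than q as a mediatype parameter, which is a simplification of the grammar in sec. 5.3.2.

```python
def parse_accept(header):
    """Split an Accept header value into (media_range, params, q) tuples.

    Simplified: assumes quoted values contain no ',' or ';'.
    """
    entries = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty list elements
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        params = {}
        q = 1.0  # the weight defaults to 1 when q is absent
        for part in parts[1:]:
            name, _, value = part.partition("=")
            name = name.strip().lower()
            value = value.strip().strip('"')
            if name == "q":
                q = float(value)
            else:
                params[name] = value
        entries.append((media_range, params, q))
    return entries
```

Calling parse_accept("multipart/related; type=text/html; q=0.7, text/html") yields two entries: the multipart/related range with its type parameter and weight 0.7, and a plain text/html range with the default weight.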

Section 3.1.1.1 discusses which mediatypes are considered valid for the Accept and Content-Type headers:

HTTP uses Internet media types [RFC2046] in the Content-Type and Accept header fields in order to provide open and extensible data typing and type negotiation. [...] Internet media types ought to be registered with IANA according to the procedures defined in [BCP13]

[BCP13] refers to RFC 6838, which in turn leads to the IANA Media Types Registry.

It bears mentioning that the syntax of the Accept header does not require any parameters to be present; they are all optional as far as the HTTP spec is concerned. Any required parameters must be mandated by the definition of the mediatype in question:

The presence or absence of a parameter might be significant to the processing of a media-type, depending on its definition within the media type registry.

The multipart/related MIME type itself is subject to RFC 2387, section 3.1 of which explicitly makes the type parameter mandatory. It is also a single value, not a list. Interestingly, the HTTP spec stresses the presence of the boundary parameter by reference to RFC 2046, section 5.1.1. From RFC 7231, section 3.1.1.4:

All multipart types share a common syntax, as defined in Section 5.1.1 of [RFC2046], and include a boundary parameter as part of the media type value.

My guess is that it never occurred to the authors that one would put a multipart mediatype into an Accept header, where the boundary is useless. This could indeed be a candidate for an erratum (Julian?). So technically, the absolutely correct™ way to request this would be:

Accept: multipart/related; type=text/html; boundary=--my-top-notch-boundary-

In reality, implementors seem inclined to deliberately ignore these requirements, as this example shows. I usually do not advocate against following the RFC, but I think it actually makes sense to skip the boundary parameter here. Bearing in mind that this is a request header used in content negotiation and not a description of some actual content with a specified boundary between message parts, I cannot think of a use case where requesting such a boundary would be legitimate, unless you are out to cause some mischief. But then again, you would only be requesting a manipulated response for yourself. I am undecided on omitting the type parameter, though. IMHO doing so would imply type=*/*, which is effectively an "I don't care, send whatever you see fit." While this may result in a response perfectly in line with RFC 2387, I would personally feel uneasy about having so little control over the returned content type. (On a side note: you may always want to check the content type of responses. A 2xx code is no guarantee that you got what you requested.)
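The side note about checking responses could be sketched as follows. The helper name is_expected_related is hypothetical, and the sketch leans on the MIME header parser from Python's standard library, which is an assumption of convenience: it is written for mail, not HTTP, but it handles a simple Content-Type value like this one.

```python
from email.message import Message


def is_expected_related(content_type, expected_type="text/html"):
    """Return True if a Content-Type value is multipart/related
    and its type parameter names the expected root type."""
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "multipart/related":
        return False
    root_type = msg.get_param("type")  # None when the parameter is absent
    return root_type is not None and root_type.lower() == expected_type
```

A response with a 200 status but Content-Type: text/html; charset=utf-8 would be rejected by this check, as would a multipart/related response that lacks the type parameter.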

Now if you send out a request with Accept: multipart/related, text/html, you are requesting either several parts of unspecified type or, alternatively, a single HTML document. If you want to negotiate the content, you will need to request several variations of multipart/related with different types:

Accept: multipart/related; type=text/html,
        multipart/related; type=text/plain

(Note: Line continuation added for improved legibility. Line continuation in header fields has been deprecated and should no longer be used in HTTP.)
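Since the header should go out on a single line, a client would typically assemble it programmatically. A minimal sketch, with the hypothetical helper name build_accept, taking (root type, weight) pairs where a weight of None means "leave q out":

```python
def build_accept(root_types):
    """Build a single-line Accept value from (root_type, q) pairs.

    q may be None, in which case no weight is emitted.
    """
    ranges = []
    for root_type, q in root_types:
        entry = f"multipart/related; type={root_type}"
        if q is not None:
            # qvalues carry at most three decimal places
            weight = f"{q:.3f}".rstrip("0").rstrip(".")
            entry += f"; q={weight}"
        ranges.append(entry)
    return ", ".join(ranges)


header = build_accept([("text/html", None), ("text/plain", 0.5)])
```

Strictly speaking, a parameter value containing "/" would have to be written as a quoted-string under the token grammar; the sketch keeps the unquoted form used in this answer.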

Regarding your example, I was quite surprised to find that the syntax for this mediatype is extraordinarily strict when it comes to parameters. The situation is as follows:

  • The Accept header as such is subject to RFC 7231, sec. 5.3.2
  • The mediatype(s) and subtype(s) are straight out of the IANA Media Types Registry per RFC 6838
  • The parameters are handled as follows:
    • q is under the authority of RFC 7231, sec. 5.3.1
    • boundary is under the authority of RFC 2046, sec. 5.1.1
    • Remaining parameters are subject to the mediatype's own RFC. For multipart/related this means that type is required, while start and start-info are optional
    • Unrecognized parameters are to be discarded as per RFC 2046, section 1:

MIME implementations must also ignore any parameters whose names they do not recognize.
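The rules listed above could be applied to a single multipart/related entry roughly like this. The names RELATED_PARAMS and strip_unrecognized are hypothetical, and the sketch assumes that parameter values contain no semicolons.

```python
# Parameters known for multipart/related per the list above, plus boundary
RELATED_PARAMS = {"type", "start", "start-info", "boundary"}


def strip_unrecognized(media_range):
    """Drop parameters that multipart/related does not define.

    The HTTP weight parameter q is always kept.
    """
    parts = [p.strip() for p in media_range.split(";")]
    kept = [parts[0]]  # the mediatype itself
    for part in parts[1:]:
        name = part.partition("=")[0].strip().lower()
        if name in RELATED_PARAMS or name == "q":
            kept.append(part)
    return "; ".join(kept)
```

For example, strip_unrecognized("multipart/related; type=text/html; level=2; q=0.4") gives "multipart/related; type=text/html; q=0.4".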

So, if level were a recognized parameter (currently it is not even one for the text/html mediatype, and yes, I am aware it appears in multiple examples), the correct solution would indeed be this:

Accept: multipart/related; type=text/html; q=0.7,
        multipart/related; type=text/html; level=1,
        multipart/related; type=text/html; level=2; q=0.4

But stripping out the level parameter, we're down to this:

Accept: multipart/related; type=text/html; q=0.7,
        multipart/related; type=text/html,
        multipart/related; type=text/html; q=0.4

which is semantically the same as:

Accept: multipart/related; type=text/html
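One way to read this equivalence is that identical entries collapse into one that carries the highest weight. That reading is an assumption on my part, since the RFC does not spell out how duplicates are to be treated, and the helper name collapse is hypothetical:

```python
def collapse(header):
    """Map each distinct entry (without q) to the highest weight seen.

    Assumption: duplicates are resolved in favour of the largest q.
    """
    best = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";") if p.strip()]
        if not parts:
            continue
        q = 1.0  # default weight when q is absent
        rest = []
        for part in parts[1:]:
            name, _, value = part.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
            else:
                rest.append(part)
        key = "; ".join([parts[0]] + rest)
        best[key] = max(q, best.get(key, 0.0))
    return best


three = ("multipart/related; type=text/html; q=0.7, "
         "multipart/related; type=text/html, "
         "multipart/related; type=text/html; q=0.4")
one = "multipart/related; type=text/html"
```

Under that assumption, collapse(three) and collapse(one) produce the same mapping: a single entry with weight 1.0.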