Unicode in Content-Disposition header

Ranjeet picture Ranjeet · Mar 30, 2010 · Viewed 15.7k times · Source

I am using HttpContext object implemented in HttpHandler child to download a file, when I have non-ascii characters in file name it looks weird in IE whereas it looks fine in Firefox.

below is the code:-

       context.Response.ContentType = ".cs";
context.Response.AppendHeader("Content-Length", data.Length.ToString());
context.Response.AppendHeader("Content-Disposition", String.Format("attachment; filename={0}",filename));
        context.Response.OutputStream.Write(data, 0, data.Length);

context.Response.Flush();

when I supply 'ß' 'ä' 'ö' 'ü' 'ó' 'ß' 'ä' 'ö' 'ü' 'ó' in file name field it looks different than what I have in file name it looks fine in firefox. adding EncodingType and charset has been of no use.

In ie it is 'ß''ä''ö''ü''ó''ß''ä''ö''ü'_'ó' and in firefox it is 'ß' 'ä' 'ö' 'ü' 'ó' 'ß' 'ä' 'ö' 'ü' 'ó'.

Any Idea how this can be fixed?

Answer

Sergej Andrejev picture Sergej Andrejev · Mar 30, 2010

I had similar problem. You have to use HttpUtility.UrlEncode or Server.UrlEncode to encode filename. Also I remember firefox didn't need it. Moreoverit ruined filename when it's url-encoded. My code:

// IE needs url encoding, FF doesn't support it, Google Chrome doesn't care
if (Request.Browser.IsBrowser ("IE"))
{
    fileName = Server.UrlEncode(fileName);
}

Response.Clear ();
Response.AddHeader ("content-disposition", String.Format ("attachment;filename=\"{0}\"", fileName));
Response.AddHeader ("Content-Length", data.Length.ToString (CultureInfo.InvariantCulture));
Response.ContentType = mimeType;
Response.BinaryWrite(data);

Edit

I have read specification more carefully. First of all RFC2183 states that:

Current [RFC 2045] grammar restricts parameter values (and hence Content-Disposition filenames) to US-ASCII.

But then I found references that [RFC 2045] is absolete and one must reference RFC 2231, which states:

Asterisks ("*") are reused to provide the indicator that language and character set information is present and encoding is being used. A single quote ("'") is used to delimit the character set and language information at the beginning of the parameter value. Percent signs ("%") are used as the encoding flag, which agrees with RFC 2047.

Which means that you can use UrlEncode for non-ascii symbols, as long as you include the encoding as stated in the rfc. Here is an example:

string.Format("attachment; filename=\"{0}\"; filename*=UTF-8''{0}", Server.UrlEncode(fileName, Encoding.UTF8));

Note that filename is included in addition to filename* for backwards compatibility. You can also choose another encoding and modify the parameter accordingly, but UTF-8 covers everything.