I have a JSON object where the value of one element is a string. In this string there are the characters "<RPC>"
. I take this entire JSON object and in my ASP.NET server code, I perform the following to take the object named rpc_response
and add it to the data in a POST response:
var serializer = new System.Web.Script.Serialization.JavaScriptSerializer();
HttpContext.Current.Response.AddHeader("Pragma", "no-cache");
HttpContext.Current.Response.AddHeader("Cache-Control", "private, no-cache");
HttpContext.Current.Response.AddHeader("Content-Disposition", "inline; filename=\"files.json\"");
HttpContext.Current.Response.Write(serializer.Serialize(rpc_response));
HttpContext.Current.Response.ContentType = "application/json";
HttpContext.Current.Response.StatusCode = 200;
After the object is serialized, I receive it on the other end (not a web browser), and that particular string looks like: \u003cRPC\u003e
.
What can I do to prevent these (and other) characters from not being encoded properly, still being able to serialize my JSON object?
The characters are being encoded "properly"!1 Use a working JSON library to correctly access the JSON data - it is a valid JSON encoding.
Escaping these characters prevents HTML injection via JSON - and makes the JSON XML-friendly. That is, even if the JSON is emited directly into JavaScript (as is done fairly often as JSON is a valid2 subset of JavaScript), it cannot be used to terminate the <script>
element early because the relevant characters (e.g. <
, >
) are encoded within JSON itself.
The standard JavaScriptSerializer
does not have the ability to change this behavior. Such escaping might be configurable (or different) in the Json.NET implementation - but, it shouldn't matter because a valid JSON client/library must understand the \u
escapes.
1 From RFC 4627: The application/json Media Type for JavaScript Object Notation (JSON),
Any character may be escaped. If the character is in the Basic Multilingual Plane (U+0000 through U+FFFF), then it may be represented as a six-character sequence: a reverse solidus, followed by the lowercase letter u, followed by four hexadecimal digits that encode the character's code point ..
See also C# To transform Facebook Response to proper encoded string (which is also related to the JSON escaping).
2 There is a rare case when this does not hold, but ignoring (or accounting for) that..