Is sanitizing JSON necessary?

Golo Roden · Sep 22, 2014 · Viewed 24.4k times

I think it's a well-known best practice on the web to mistrust any input. The sentence

"All input is evil."

is probably the most cited quote with respect to input validation. Now, for HTML you can use tools such as DOMPurify to sanitize it.

My question is: if I have a Node.js server running Express with the body-parser middleware to receive and parse JSON, do I need to do any sanitizing as well?

My (maybe naive?) thoughts on this are that JSON is only data, no code, and if somebody sends invalid JSON, body-parser (which uses JSON.parse() internally) will fail anyway, so I know that my app will receive a valid JavaScript object. As long as I don't run eval on that or call a function, I should be fine, shouldn't I?

Am I missing something?

Answer

jfriend00 · Sep 22, 2014

Since JSON.parse() does not run any code in the data to be parsed, it is not vulnerable the way eval() is, but there are still things you should do to protect the integrity of your server and application such as:

  1. Apply exception handlers in the appropriate place as JSON.parse() can throw an exception.
  2. Don't make assumptions about what data is there; explicitly test for data before using it.
  3. Only process properties you are specifically looking for (avoiding other things that might be in the JSON).
  4. Validate all incoming data as legitimate, acceptable values.
  5. Limit the length of data (to prevent DoS issues with overly large payloads).
  6. Don't put this incoming data into places where it could be further evaluated such as directly into the HTML of the page or injected directly into SQL statements without further sanitization to make sure it is safe for that environment.

So, to answer your question directly: yes, there is more to do than just using body-parser, though it is a perfectly fine first line of defense for processing the data. What you do with the data once you get it from body-parser matters in many cases and can require extra care.
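
To illustrate point 6, here is a minimal sketch of escaping a value from parsed JSON before placing it into HTML. The `escapeHtml` helper is a hypothetical name, not part of body-parser or Express, and it only covers HTML text and quoted attribute values. For SQL, the equivalent precaution is to use parameterized queries rather than string concatenation.

```javascript
// Hypothetical helper: escape the five characters that are significant in
// HTML text and quoted attribute values.
function escapeHtml(value) {
    return String(value)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&#39;");
}

// Perfectly valid JSON can still carry markup inside a string value.
var body = JSON.parse('{"name": "<script>alert(1)</script>"}');

// Escape at the point where the value enters the HTML context.
var html = "<p>Hello, " + escapeHtml(body.name) + "</p>";
```

JSON.parse() accepted the payload without complaint, which is the point: the data is valid JSON, but it only becomes dangerous when it is inserted unescaped into another context.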

As an example, here's a parsing function that expects an object, applies some of these checks, and returns a filtered result containing only the properties you were expecting:

// pass expected list of properties and optional maxLen
// returns obj or null
function safeJSONParse(str, propArray, maxLen) {
    var parsedObj, safeObj = {};
    try {
        if (maxLen && str.length > maxLen) {
            return null;
        } else {
            parsedObj = JSON.parse(str);
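            if (typeof parsedObj !== "object" || Array.isArray(parsedObj)) {
                // not a plain object (primitive or array): return it as-is
                safeObj = parsedObj;
            } else {
                // copy only expected properties to the safeObj
                propArray.forEach(function(prop) {
                    // call via the prototype so a "hasOwnProperty" key in the payload can't shadow it
                    if (Object.prototype.hasOwnProperty.call(parsedObj, prop)) {
                        safeObj[prop] = parsedObj[prop];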
                    }
                });
            }
            return safeObj;
        }
    } catch(e) {
        return null;
    }
}
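
The function above filters property names but copies the values without checking them, so point 4 is still up to the caller. One possible way to add a basic type check is sketched below. The `pickTyped` helper and the `schema` shape are assumptions made for illustration, and the check only covers primitive `typeof` types, not ranges or formats.

```javascript
// Hypothetical helper: copy only the expected properties, and only when each
// value has the expected primitive type. Returns null if anything is off.
function pickTyped(obj, schema) {
    var result = {};
    if (obj === null || typeof obj !== "object" || Array.isArray(obj)) {
        return null;
    }
    var keys = Object.keys(schema);
    for (var i = 0; i < keys.length; i++) {
        var key = keys[i];
        // the property must exist on the object itself and have the expected type
        if (!Object.prototype.hasOwnProperty.call(obj, key) ||
                typeof obj[key] !== schema[key]) {
            return null;
        }
        result[key] = obj[key];
    }
    return result;
}

// Assumed shape of the expected data: property name -> typeof result
var schema = { name: "string", age: "number" };

// ok contains only name and age; the unexpected admin property is not copied
var ok = pickTyped(JSON.parse('{"name": "Ann", "age": 30, "admin": true}'), schema);

// bad is null because name is an object rather than a string
var bad = pickTyped(JSON.parse('{"name": {"$gt": ""}, "age": 30}'), schema);
```

Rejecting the whole object on the first mismatch is a design choice made for this sketch; depending on the application, it may be preferable to collect all errors and report them to the client.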