What's the right way to decode a string that has special HTML entities in it?

Dan Tao · Sep 13, 2011 · Viewed 248.3k times

Say I get some JSON back from a service request that looks like this:

{
    "message": "We&#39;re unable to complete your request at this time."
}

I'm not sure why that apostrophe is encoded like that (&#39;); all I know is that I want to decode it.

Here's one approach using jQuery that popped into my head:

function decodeHtml(html) {
    return $('<div>').html(html).text();
}

That seems (very) hacky, though. What's a better way? Is there a "right" way?

Answer

Rob W · Sep 13, 2011

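This is my favourite way of decoding HTML characters. The advantage of using this code is that tags are also preserved: a textarea treats its content as plain text, so entities are decoded but markup in the input comes back as literal text rather than being parsed into elements.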

function decodeHtml(html) {
    var txt = document.createElement("textarea");
    txt.innerHTML = html;
    return txt.value;
}

Example: http://jsfiddle.net/k65s3/

Input:

Entity:&nbsp;Bad attempt at XSS:<script>alert('new\nline?')</script><br>

Output:

Entity: Bad attempt at XSS:<script>alert('new\nline?')</script><br>
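
Where no DOM is available (in Node, for example), the textarea trick cannot be used as written. The following is a minimal sketch of a different technique, not part of the answer above: it decodes character references with a regular expression. The names `decodeEntities` and `NAMED` are hypothetical, only numeric references and a handful of named entities are handled, and anything it does not recognise is left untouched. Like the textarea version, it never parses tags, so markup passes through as literal text.

```javascript
// Hypothetical, deliberately tiny table; the full HTML named-entity list is much longer.
const NAMED = { amp: "&", lt: "<", gt: ">", quot: '"', apos: "'", nbsp: "\u00A0" };

function decodeEntities(text) {
    // Matches &#39; (decimal), &#x27; (hex) and &name; forms in a single pass,
    // so already-decoded output is never decoded a second time.
    return text.replace(/&(?:#(\d+)|#[xX]([0-9a-fA-F]+)|([a-zA-Z][a-zA-Z0-9]*));/g,
        function (match, dec, hex, name) {
            if (dec !== undefined || hex !== undefined) {
                var code = dec !== undefined ? parseInt(dec, 10) : parseInt(hex, 16);
                // Leave out-of-range code points as they were instead of throwing.
                return code <= 0x10FFFF ? String.fromCodePoint(code) : match;
            }
            // Unknown names are returned unchanged.
            return Object.prototype.hasOwnProperty.call(NAMED, name) ? NAMED[name] : match;
        });
}

decodeEntities("We&#39;re unable to complete your request at this time.");
// "We're unable to complete your request at this time."
```

A browser's own parser covers the complete entity list and several edge cases this sketch ignores, so the textarea approach remains the simpler choice whenever a document object exists.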