I need to display external resources loaded via cross domain requests and make sure to only display "safe" content.
I could use Prototype's String#stripScripts to remove script blocks, but handlers such as onclick or onerror are still there.
Is there any library that can at least strip script blocks, kill DOM handlers, and remove blacklisted tags (e.g. embed or object)? Are there any JavaScript-related links and examples out there?
Update 2016: There is now a Google Closure package based on the Caja sanitizer.
It has a cleaner API, was rewritten to take into account APIs available on modern browsers, and interacts better with Closure Compiler.
Shameless plug: see caja/plugin/html-sanitizer.js for a client-side HTML sanitizer that has been thoroughly reviewed.
It is whitelist-based, not blacklist-based, and the whitelists are configurable as per CajaWhitelists.
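To illustrate what whitelisting means here, below is a minimal sketch of the idea only. It is not Caja's API: the ALLOWED table and the filterElement helper are hypothetical names made up for this example, and a real sanitizer also has to parse the markup and vet attribute values (for instance URLs in href).

```javascript
// Hypothetical whitelist: tag name -> attribute names allowed on that tag.
// Anything not listed here is dropped, which is the opposite of a blacklist.
var ALLOWED = {
  a: ['href', 'title'],
  b: [],
  i: [],
  p: [],
  img: ['src', 'alt']
};

// Returns the attributes that survive the whitelist for an allowed tag,
// or null when the tag itself is not whitelisted.
function filterElement(tagName, attrs) {
  var name = String(tagName).toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(ALLOWED, name)) {
    return null;
  }
  var allowedAttrs = ALLOWED[name];
  var kept = {};
  Object.keys(attrs).forEach(function (attr) {
    var lower = attr.toLowerCase();
    if (allowedAttrs.indexOf(lower) !== -1) {
      kept[lower] = attrs[attr];
    }
  });
  return kept;
}
```

With this table, filterElement('img', {src: 'cat.png', onerror: 'alert(1)'}) keeps only src, and filterElement('object', {}) returns null because object was never allowed. Handlers like onerror disappear without anyone having to enumerate them, which is the main argument for a whitelist.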
If you want to remove all tags, then do the following:
var tagBody = '(?:[^"\'>]|"[^"]*"|\'[^\']*\')*';
var tagOrComment = new RegExp(
    '<(?:'
    // Comment body.
    + '!--(?:(?:-*[^->])*--+|-?)'
    // Special "raw text" elements whose content should be elided.
    + '|script\\b' + tagBody + '>[\\s\\S]*?</script\\s*'
    + '|style\\b' + tagBody + '>[\\s\\S]*?</style\\s*'
    // Regular name
    + '|/?[a-z]'
    + tagBody
    + ')>',
    'gi');
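function removeTags(html) {
  var oldHtml;
  // Repeat until nothing changes: removing one tag can splice the
  // surrounding text into a new tag, e.g. '<<i>script>'.
  do {
    oldHtml = html;
    html = html.replace(tagOrComment, '');
  } while (html !== oldHtml);
  // Escape any '<' that is left so it cannot start a tag.
  return html.replace(/</g, '&lt;');
}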
People will tell you that you can create an element, assign innerHTML, then get the innerText or textContent, and then escape entities in that. Do not do that. It is vulnerable to XSS injection, since <img src=bogus onerror=alert(1337)> will run the onerror handler even if the node is never attached to the DOM.
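If the goal is only to show untrusted text literally rather than to strip tags from it, the escaping can be done on the string itself, so nothing is ever parsed as HTML. A minimal sketch, assuming a helper named escapeHtml (a name chosen for this example, not part of any library mentioned above):

```javascript
// Escapes the characters that are significant in HTML text and in
// quoted attribute values. '&' must be replaced first so the entities
// produced by the later replacements are not escaped a second time.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}
```

For example, escapeHtml('<img src=bogus onerror=alert(1337)>') yields '&lt;img src=bogus onerror=alert(1337)&gt;', which a browser renders as visible text instead of creating an img element.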