If the following statements are true:
Documents are served with Content-Type: text/html; charset=UTF-8.
There are no <script> tags in the document.
Are there any cases where htmlspecialchars($input, ENT_QUOTES, 'UTF-8') (converting &, ", ', <, > to the corresponding HTML entities) is not enough to protect against cross-site scripting when generating HTML on a web server?
htmlspecialchars() is enough to prevent document-creation-time HTML injection with the limitations you state (i.e. no injection into tag content or unquoted attributes).
However, there are other kinds of injection that can lead to XSS, and the condition
"There are no <script> tags in the document."
doesn't cover all cases of JS injection. You might, for example, have an event handler attribute (which requires JS-escaping inside HTML-escaping):
<div onmouseover="alert('<?php echo htmlspecialchars($xss) ?>')"> // bad!
or, even worse, a javascript: link (requires JS-escaping inside URL-escaping inside HTML-escaping):
<a href="javascript:alert('<?php echo htmlspecialchars($xss) ?>')"> // bad!
It is usually best to avoid these constructs anyway, but especially when templating. Writing <?php echo htmlspecialchars(urlencode(json_encode($something))) ?> is quite tedious.
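To make the layering for the event handler case concrete, here is a minimal sketch of "JS-escaping inside HTML-escaping". The helper name js_literal_for_attribute is hypothetical (it is not a PHP built-in), and the sketch assumes PHP 7.3+ for JSON_THROW_ON_ERROR. It only covers the quoted event-handler attribute, not the javascript: link, which would need a URL-encoding layer in between.

```php
<?php
// Hypothetical helper: encode a PHP value as a JavaScript literal that can be
// placed inside a quoted HTML event-handler attribute.
function js_literal_for_attribute($value): string
{
    // Inner layer: JS-escape. The JSON_HEX_* flags turn <, >, &, ' and "
    // inside the value into \uXXXX escapes.
    $js = json_encode(
        $value,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR
    );

    // Outer layer: HTML-escape for the surrounding quoted attribute
    // (this encodes the double quotes that delimit the JSON string).
    return htmlspecialchars($js, ENT_QUOTES, 'UTF-8');
}

// Example input that would break out of alert('...') with htmlspecialchars() alone.
$xss = "'); alert(document.cookie); ('";
echo '<div onmouseover="alert(' . js_literal_for_attribute($xss) . ')">hover</div>', "\n";
```

Note that the helper emits the string delimiters itself, so the template must not add its own quotes around the value. A simpler design, in keeping with the advice to avoid these constructs, is to put the value in a plain attribute (HTML-escaped once) and read it from a separate script.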
And... injection issues can happen on the client side as well (DOM XSS); htmlspecialchars() won't protect you against a piece of JavaScript writing to innerHTML (commonly .html() in poor jQuery scripts) without explicit escaping.
And... XSS has a wider range of causes than just injections. Other common causes are:
allowing the user to create links, without checking for known-good URL schemes (javascript: is the most well-known harmful scheme but there are more)
deliberately allowing the user to create markup, either directly or through light-markup schemes (like bbcode which is invariably exploitable)
allowing the user to upload files (which can through various means be reinterpreted as HTML or XML)
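For the first cause above (user-created links), a known-good scheme check might look like the following sketch. The function names is_safe_link and render_link and the default allow-list are assumptions for illustration. The check is deliberately strict: anything without an explicit allow-listed scheme at the very start is rejected, which also rejects relative URLs.

```php
<?php
// Hypothetical helper: accept only URLs that start with an allow-listed scheme.
function is_safe_link(string $url, array $allowedSchemes = ['http', 'https', 'mailto']): bool
{
    // Require an explicit scheme at the very start of the string. Relative URLs
    // and schemes obscured by leading whitespace or embedded control characters
    // do not match and are rejected.
    if (!preg_match('/\A([a-z][a-z0-9+.\-]*):/i', $url, $m)) {
        return false;
    }

    // Compare case-insensitively against known-good schemes only.
    return in_array(strtolower($m[1]), $allowedSchemes, true);
}

// Hypothetical helper: render a link, falling back to "#" for rejected URLs.
function render_link(string $url, string $label): string
{
    if (!is_safe_link($url)) {
        $url = '#';
    }

    // HTML-escaping is still required for both the attribute value and the text.
    return '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">'
        . htmlspecialchars($label, ENT_QUOTES, 'UTF-8') . '</a>';
}

echo render_link('https://example.com/?a=1&b=2', 'Example'), "\n";
echo render_link('javascript:alert(1)', 'Click me'), "\n";
```

An allow-list is used rather than a block-list because, as noted, javascript: is not the only harmful scheme.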