If I'm sanitizing my DB inserts, and also escaping the HTML I write with htmlentities($text, ENT_COMPAT, 'UTF-8'), is there any point in also filtering the inputs with xss_clean? What other benefits does it give?
xss_clean() is extensive, and also silly. 90% of this function does nothing to prevent XSS, such as looking for the word alert but not document.cookie. No hacker is going to use alert in their exploit; they are going to hijack the cookie with XSS or read a CSRF token to make an XHR.
However, running htmlentities() or htmlspecialchars() in addition to it is redundant. A case where xss_clean() fixes the issue and htmlentities($text, ENT_COMPAT, 'UTF-8') fails, because ENT_COMPAT leaves single quotes unescaped, is the following:
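<?php
// Deliberately vulnerable: $var lands unescaped inside a single-quoted attribute.
$var = $_GET['var'] ?? '';
print "<img src='$var'>";
?>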
A simple PoC is:
http://localhost/xss.php?var=http://domain/some_image.gif'%20onload=alert(/xss/)
This will add the onload= event handler to the image tag. One way to stop this form of XSS is htmlspecialchars($var, ENT_QUOTES); in this case xss_clean() will also prevent it.
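To make the difference between the two flags concrete, here is a minimal sketch. The value of $var is an assumption that mirrors the decoded PoC query string, and the variable names $compat and $quotes are just for illustration:

```php
<?php
// Hypothetical attacker-controlled value, mirroring the decoded PoC query string.
$var = "http://domain/some_image.gif' onload=alert(/xss/)";

// ENT_COMPAT escapes double quotes only, so the single quote survives
// and closes the src attribute early.
$compat = htmlentities($var, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES escapes both quote styles, so the payload stays inside src.
$quotes = htmlspecialchars($var, ENT_QUOTES, 'UTF-8');

print "<img src='$compat'>\n"; // onload= becomes a real attribute
print "<img src='$quotes'>\n"; // the single quote is emitted as &#039;
```

With ENT_COMPAT the quote passes through untouched, which is exactly what the PoC relies on. Wrapping the attribute in double quotes would also sidestep this particular payload, but escaping both quote styles is the less fragile habit.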
However, quoting from the xss_clean() documentation:
Nothing is ever 100% foolproof, of course, but I haven't been able to get anything passed the filter.
That being said, XSS is an output problem, not an input problem. For instance, this function cannot take into account that the variable is already within a <script> tag or event handler. It also doesn't stop DOM-based XSS. You need to take into consideration how you are using the data in order to choose the best function. Filtering all data on input is bad practice: not only is it insecure, it also corrupts data, which can make comparisons difficult.
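As a rough illustration of escaping per output context, here is a sketch that stores the raw value and encodes it only at the point of use. The helper names escape_html and escape_js are hypothetical, not part of PHP or CodeIgniter, and the sketch assumes the input is valid UTF-8:

```php
<?php
// Hypothetical untrusted value, kept unmodified until it is written out.
$name = "</script><script>alert(1)</script>' onmouseover='x";

// HTML body or quoted-attribute context: escape &, <, > and both quote styles.
function escape_html(string $s): string {
    return htmlspecialchars($s, ENT_QUOTES, 'UTF-8');
}

// Inline <script> context: emit a JSON string literal with <, >, &, ' and "
// hex-escaped so the value cannot close the script block or the literal.
function escape_js(string $s): string {
    return json_encode($s, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);
}

print "<p title='" . escape_html($name) . "'>" . escape_html($name) . "</p>\n";
print "<script>var name = " . escape_js($name) . ";</script>\n";
```

The same stored value gets a different encoding in each place, which a single input-time filter cannot do. This does not address DOM-based XSS, which has to be handled in the client-side code that writes to the DOM.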