I have seen a lot of conflicting answers about this. Many people love to point out that PHP functions alone will not protect you from XSS.
What XSS exactly can make it through htmlspecialchars(), and what can make it through htmlentities()?
I understand the difference between the functions, but not the different levels of XSS protection each one leaves you with. Could anyone explain?
htmlspecialchars() will NOT protect you against UTF-7 XSS exploits, which still plague Internet Explorer, even in IE 9: http://securethoughts.com/2009/05/exploiting-ie8-utf-7-xss-vulnerability-using-local-redirection/
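To make the UTF-7 point concrete, here is a minimal sketch. The payload string and the Content-Type mitigation are my own illustrative assumptions, not taken from the linked article. It only shows that a UTF-7 encoded script tag contains none of the characters htmlspecialchars() rewrites, so it passes through untouched:

```php
<?php
// Assumed mitigation: declare the charset up front so the browser
// does not have to guess (and cannot guess UTF-7).
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

// Hypothetical payload: "<script>alert(1)</script>" written in UTF-7,
// where "<" becomes "+ADw-" and ">" becomes "+AD4-".
$payload = '+ADw-script+AD4-alert(1)+ADw-/script+AD4-';

// htmlspecialchars() only rewrites & " ' < > and none of them occur here.
$escaped = htmlspecialchars($payload, ENT_QUOTES, 'UTF-8');

var_dump($escaped === $payload); // bool(true): the payload is unchanged
```

The escaped string only becomes dangerous if the browser decides to interpret the page as UTF-7, which is why stating the charset explicitly matters here.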
For instance:
<?php
$_GET['password'] = 'asdf&ddddd"fancy˝quotes˝';
echo htmlspecialchars($_GET['password'], ENT_COMPAT | ENT_HTML401, 'UTF-8') . "\n";
// Output: asdf&amp;ddddd&quot;fancyË
echo htmlentities($_GET['password'], ENT_COMPAT | ENT_HTML401, 'UTF-8') . "\n";
// Output: asdf&amp;ddddd&quot;fancy&Euml;quotes
You should always use htmlentities() and only very rarely use htmlspecialchars() when sanitizing user input. Also, you should always strip tags first. And for really important, secure sites, you should NEVER trust strip_tags() alone. Use HTML Purifier for PHP.
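The "strip tags first, then encode" order could be sketched roughly like this. The helper name escape_plain_text() is hypothetical, and the ENT_QUOTES flag and UTF-8 charset are my assumptions rather than settings prescribed above:

```php
<?php
// Hypothetical helper: strip markup, then entity-encode what is left.
function escape_plain_text(string $input): string
{
    // strip_tags() removes the tags themselves but keeps the text
    // between them, including the body of a <script> element.
    $text = strip_tags($input);

    // ENT_QUOTES also encodes single quotes, which ENT_COMPAT leaves alone.
    return htmlentities($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');
}

echo escape_plain_text('<b>Hi</b> <script>alert("x")</script> & bye'), "\n";
// Prints: Hi alert(&quot;x&quot;) &amp; bye
```

Note that this is only suitable when the result is placed in an HTML text node or a quoted attribute, and when no markup at all should survive. If some tags have to be allowed through, that is the case where HTML Purifier is the recommendation instead of strip_tags().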