PHP - htmlspecialchars and UTF-8

Lizard · Dec 3, 2009

I am just trying to confirm something with htmlspecialchars. I have just converted my database into UTF-8, and I think I finally have it all working, but throughout my code I have used the PHP htmlspecialchars function:

htmlspecialchars($val, ENT_QUOTES, 'ISO-8859-1', false);

Do I need to worry about changing all the entries to:

htmlspecialchars($val, ENT_QUOTES, 'UTF-8', false);

The PHP documentation (quoted below) suggests I don't need to, but is that true?

For the purposes of this function, the charsets ISO-8859-1, ISO-8859-15, UTF-8, cp866, cp1251, cp1252, and KOI8-R are effectively equivalent, as the characters affected by htmlspecialchars() occupy the same positions in all of these charsets.

Answer

VolkerK · Dec 3, 2009

All characters "handled" by htmlspecialchars() are in the 7-bit US-ASCII range, and those are encoded identically (and unambiguously) in all of the encodings mentioned. So yes, it will do no harm if you don't change the encoding parameter. But I'd encourage you to change it anyway.
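A quick way to check this yourself is to run the same string through both calls and compare the results. The snippet below is only a minimal sketch: the sample string is made up for illustration, and its non-ASCII characters are written as explicit UTF-8 byte escapes so the result does not depend on how the file is saved. It assumes the input is valid UTF-8.

```php
<?php
// Hypothetical sample value: contains all five characters that
// htmlspecialchars() escapes with ENT_QUOTES (& < > " '), plus two
// non-ASCII characters given as raw UTF-8 bytes.
$val = "<a href=\"x\">Z\xC3\xBCrich & 'K\xC3\xB6ln'</a>";

// Same call as in the question, once per charset argument.
$asLatin1 = htmlspecialchars($val, ENT_QUOTES, 'ISO-8859-1', false);
$asUtf8   = htmlspecialchars($val, ENT_QUOTES, 'UTF-8', false);

// Only the ASCII characters are replaced; the multi-byte sequences
// pass through untouched, so both results are byte-for-byte equal.
var_dump($asLatin1 === $asUtf8);
echo $asUtf8, "\n";
```

For valid UTF-8 input like this, the comparison should come out true, which matches the statement in the documentation quoted in the question.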