Question mark characters on HTML if I use UTF-8 and weird characters on SQL data if I use ISO-8859-1

Gus · Jul 26, 2013 · Viewed 14.1k times

I'm making a page with Latin accented characters like á, ã, ç, and others. The site pulls data from a SQL database. I'm using this in the <head>:
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"/>
With this header the HTML accented characters are fine, but the SQL data is displayed as Ã£o instead of ão. If I change the charset from ISO-8859-1 to UTF-8, all the HTML accented characters are displayed as � (a question mark) and the SQL data shows accents just fine.
Is there any way to fix it besides escaping either all the HTML characters or the SQL ones?

PS: I've already tried mysql_set_charset('utf8'); and SET NAMES utf8; neither worked for me.

Answer

e-sushi · Jul 26, 2013

When you see question marks (�), your document has not been saved in the encoding it declares (it should be UTF-8 in your case), or it isn't being served with the correct headers and/or meta tags.

If you want to work with special characters like è, your HTML document should be saved as UTF-8 and served as UTF-8:

<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
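The meta tag only declares the encoding; the file on disk still has to contain UTF-8 bytes. One way to check that is to try decoding the file's bytes as UTF-8. The following is a minimal sketch in Python, used here only as a language-neutral illustration; the helper names `is_valid_utf8` and `latin1_to_utf8` are hypothetical, and the sample bytes stand in for a page saved by an editor set to ISO-8859-1.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode bytes that were saved as ISO-8859-1 into UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")


# Simulate a page fragment saved by an editor configured for ISO-8859-1.
saved_as_latin1 = "<p>ação</p>".encode("iso-8859-1")

print(is_valid_utf8(saved_as_latin1))                  # False
print(is_valid_utf8(latin1_to_utf8(saved_as_latin1)))  # True
```

Two caveats: a file containing only ASCII passes the check in either encoding, and running the conversion on a file that is already UTF-8 would double-encode it, so check first and convert only when the check fails. In practice, re-saving the file as UTF-8 from your editor achieves the same thing.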

Additionally, you have to use UTF-8 for your database connections:

…("SET NAMES utf8");
…("SET CHARACTER SET utf8");

And last but not least, you have to use UTF-8 for the database itself.

As you'll notice, you're already on the right path. You just have to apply all of the above together instead of trying one thing at a time and dropping the rest; it's the combination that makes it work. Put simply: if you go UTF-8, you have to think UTF-8 everywhere, in your HTML files, your headers, your meta tags, your database connections, and the database(s), instead of using "a bit of UTF-8 here and a bit of ISO-whatever there".
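To see why mixing encodings produces both symptoms from the question, here is a small sketch (Python again, purely as an illustration; the sample word is an assumption chosen to contain the accents mentioned). It takes the same text and reads its bytes back with the wrong encoding in each direction.

```python
text = "ação"

# Case 1: UTF-8 bytes (e.g. from the database) read as ISO-8859-1.
utf8_as_latin1 = text.encode("utf-8").decode("iso-8859-1")

# Case 2: ISO-8859-1 bytes (e.g. the saved HTML file) read as UTF-8.
# Invalid byte sequences are replaced with U+FFFD, the "question mark" glyph.
latin1_as_utf8 = text.encode("iso-8859-1").decode("utf-8", errors="replace")

# Matching encodings on both sides round-trip cleanly.
consistent = text.encode("utf-8").decode("utf-8")

print(utf8_as_latin1)   # aÃ§Ã£o
print(latin1_as_utf8)   # a��o
print(consistent)       # ação
```

Case 1 is what happens to the SQL data under the ISO-8859-1 header, and case 2 is what happens to the hard-coded HTML text under the UTF-8 header. Neither is a database or browser bug; each side is simply being read with an encoding it was not written in.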