How to display special characters in PHP

Phil Tune · Oct 3, 2012

I've seen this asked several times, but not with a good resolution. I have the following string:

$string = "<p>Résumé</p>";

I want to print or echo the string, but the output comes back as <p>R�sum�</p>. So I try htmlspecialchars() or htmlentities(), which outputs &lt;p&gt;R&eacute;sum&eacute;&lt;/p&gt;, and the browser renders that as the literal text <p>Résumé</p>, tags included. I want it, obviously, to render this:

Résumé

And I'm using UTF-8:

header("Content-type: text/html; charset=UTF-8");

What am I missing here? Why do echo and print output a � for any special character? To clarify, the string is actually an entire HTML file stored in a database. The real-world application is not just that one small line.

Answer

Phil Tune · Oct 8, 2012

After much banging-head-on-table, I have a bit better understanding of the issue that I wanted to post for anyone else who may have had this issue.

While the UTF-8 character set will display special characters on the client, the server, on the other hand, may not be so accommodating and would print special characters such as à and è as � and �.
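A minimal sketch of what is probably going on at the byte level, assuming the stored string is ISO-8859-1 and that the mbstring extension is available (neither is stated in the question). The variable names are just for illustration. A lone ISO-8859-1 "é" byte is not valid UTF-8, which is why a browser told to expect UTF-8 shows � in its place:

```php
<?php
// "é" written as the single ISO-8859-1 byte 0xE9, using an escape so the
// example does not depend on how this file itself is saved.
$latin1 = "R\xE9sum\xE9";

// The same text re-encoded as UTF-8: each "é" becomes two bytes, 0xC3 0xA9.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)

echo bin2hex($latin1), "\n"; // 52e973756de9
echo bin2hex($utf8), "\n";   // 52c3a973756dc3a9
```

If the first check comes back false for your own data, the bytes and the declared charset disagree, and one of the two has to change.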

To make sure your server will print them correctly, use the ISO-8859-1 charset:

<?php
    /*Just for your server-side code*/
    header('Content-Type: text/html; charset=ISO-8859-1');
?>
<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8"><!-- Your HTML file can still use UTF-8-->
        <title>Untitled Document</title>
    </head>
    <body>
        <?= "àè" ?>
    </body>
</html>

This will print correctly: àè


Edit (4 years later):

I have a little better understanding now. The reason this works is that the client (browser) is being told, through the response header(), to expect an ISO-8859-1 text/html file, which matches the encoding the bytes were actually stored in. (As others have mentioned, you can also do this by updating your .ini or .htaccess files.) Then, once the browser begins to parse that given file into the DOM, the charset from the HTTP header takes precedence over any <meta charset=""> rule, so your ISO characters stay intact.
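For the .ini route mentioned above, here is a rough sketch of the same setting applied at runtime. It assumes the directive meant is PHP's default_charset, which PHP uses for the charset in its default Content-Type header. The Apache directive named in the comment is the usual .htaccess counterpart, not something confirmed in this thread:

```php
<?php
// Runtime equivalent of this php.ini line:
//     default_charset = "ISO-8859-1"
// (In an Apache .htaccess file, "AddDefaultCharset ISO-8859-1" plays a similar role.)
// Must run before any output is sent for the header to be affected.
ini_set('default_charset', 'ISO-8859-1');

// Confirm the value PHP will now use for its default Content-Type header.
echo ini_get('default_charset'), "\n"; // ISO-8859-1
```

An explicit header('Content-Type: ...') call like the one in the answer still overrides this default for that response.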