Displaying unicode symbols in HTML

Peter Craig picture Peter Craig · Nov 8, 2009 · Viewed 157.8k times · Source

I want to simply display the tick (✔) and cross (✘) symbols in a HTML page but it shows up as either a box or goop ✔ - obviously something to do with the encoding.

I have set the meta tag to show utf-8 but obviously I'm missing something.

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Edit/Solution: From comments made, using FireBug I found the headers being passed by my page were in fact "Content-Type: text/html" and not UTF-8. Looking at the file format using Notepad++ showed my file was formatted as "UTF-8 without BOM". Changing this to just UTF-8 the symbols now show correctly... but firebug still seems to indicate the same content-type.

Answer

Nicolas Goy picture Nicolas Goy · Nov 8, 2009

You should ensure the HTTP server headers are correct.

In particular, the header:

Content-Type: text/html; charset=utf-8

should be present.

The meta tag is ignored by browsers if the HTTP header is present.

Also ensure that your file is actually encoded as UTF-8 before serving it, check/try the following:

  • Ensure your editor save it as UTF-8.
  • Ensure your FTP or any file transfer program does not mess with the file.
  • Try with HTML encoded entities, like &#uuu;.
  • To be really sure, hexdump the file and look as the character, for the ✔, it should be E2 9C 94 .

Note: If you use an unicode character for which your system can't find a glyph (no font with that character), your browser should display a question mark or some block like symbol. But if you see multiple roman characters like you do, this denotes an encoding problem.