Should I change from UTF-8 to UTF-16 to accommodate Chinese characters in my HTML?

Aaron Salazar · Oct 5, 2010 · Viewed 31.8k times

I am using ASP.NET MVC, MS SQL and IIS. I have a few users who have used Chinese characters in their profile info. However, when I display this information it shows up as æŽå¼·è¯, even though the characters are correct in my database. Currently the encoding for my HTML pages is set to UTF-8. Should I change it to UTF-16? I understand a few problems can come from this, but what are my choices?

Thank you,

Aaron

Answer

Yuji · Oct 5, 2010

UTF-8 and UTF-16 encode exactly the same set of characters. It's not that UTF-8 doesn't cover Chinese characters and UTF-16 does. UTF-16 uses one 16-bit unit for most characters, including nearly all commonly used Chinese characters, and two 16-bit units (a surrogate pair) for the rest; UTF-8 uses 1, 2, 3, or at most 4 bytes, depending on the character, so that an ASCII character is still represented as 1 byte. Start with this Wikipedia article to get the idea behind it.
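To make the size difference concrete, here is a small Python sketch (Python is used only for illustration; it is not part of the asker's ASP.NET setup). The sample characters and the helper name `encoded_sizes` are my own choices, not anything from the question.

```python
# Compare how many bytes each encoding needs for a few sample characters.
# The samples and the helper name are illustrative assumptions.
samples = {
    "A": "ASCII letter",
    "\u00e9": "Latin letter e with acute accent",
    "\u5f37": "CJK ideograph in the Basic Multilingual Plane",
    "\U00020000": "CJK ideograph outside the Basic Multilingual Plane",
}


def encoded_sizes(ch):
    """Return (UTF-8 byte count, UTF-16 byte count) for one character."""
    # "utf-16-le" is used so that no byte order mark is counted.
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


for ch, description in samples.items():
    utf8_len, utf16_len = encoded_sizes(ch)
    print(f"U+{ord(ch):04X} {description}: "
          f"UTF-8 {utf8_len} bytes, UTF-16 {utf16_len} bytes")
```

Both encodings handle all four characters; only the byte counts differ. The last sample needs 4 bytes in either encoding, because UTF-16 falls back to a surrogate pair for it.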

So, there's little chance switching to UTF-16 will help you at all. There's a chance it makes things worse, as is discussed in the SO question you linked above. The problem is somewhere else in your setup: some component is not correctly handling characters outside ASCII or Latin-1. Make sure every part of your setup works in UTF-8.
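The garbled text in the question is consistent with correct UTF-8 bytes being decoded as Windows-1252 somewhere along the way. The Python sketch below reproduces that; treat it as an illustration under assumptions, not a diagnosis of the actual setup. The original name is my reconstruction from the garbled bytes, and `errors="ignore"` stands in for the two bytes that have no printable Windows-1252 character and so went missing from the pasted text.

```python
# Reproduce the garbling: UTF-8 bytes read back with the wrong decoder.
# ASSUMPTION: the name below is reconstructed from the garbled text in
# the question; it does not appear in the source.
name = "\u674e\u5f37\u83ef"  # three CJK characters

utf8_bytes = name.encode("utf-8")
print(utf8_bytes.hex(" "))  # e6 9d 8e e5 bc b7 e8 8f af

# Decode the same bytes as Windows-1252. Bytes 0x9d and 0x8f are
# undefined in Python's cp1252 codec, so they are skipped here
# (an assumption about how they vanished from the pasted text).
garbled = utf8_bytes.decode("cp1252", errors="ignore")

seen_in_question = "æŽå¼·è¯"
print(garbled == seen_in_question)  # True

# The bytes themselves were never damaged: decoding them as UTF-8
# gives the original text back.
print(utf8_bytes.decode("utf-8") == name)  # True
```

If this is what is happening, the stored data is fine and the fix belongs in whichever layer picks the wrong decoder, which is why changing the page encoding to UTF-16 would not address it.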