UTF8 vs. UTF16 vs. char* vs. what? Someone explain this mess to me!

dicroce picture dicroce · Oct 5, 2008 · Viewed 27.2k times · Source

I've managed to mostly ignore all this multi-byte character stuff, but now I need to do some UI work and I know my ignorance in this area is going to catch up with me! Can anyone explain in a few paragraphs or less just what I need to know so that I can localize my applications? What types should I be using (I use both .Net and C/C++, and I need this answer for both Unix and Windows).

Answer

Dylan Beattie picture Dylan Beattie · Oct 5, 2008

Check out Joel Spolsky's The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

EDIT 20140523: Also, watch Characters, Symbols and the Unicode Miracle by Tom Scott on YouTube - it's just under ten minutes, and a wonderful explanation of the brilliant 'hack' that is UTF-8