Unicode file in notepad

FSm · Dec 15, 2012 · Viewed 36.6k times

What does it mean when I save a text file as "Unicode" in Notepad? Is it UTF-8, UTF-16, or UTF-32? Thanks in advance.

Answer

Jukka K. Korpela · Dec 15, 2012

In Notepad, as in Windows software in general, “Unicode” as an encoding name means UTF-16 Little Endian (UTF-16LE). (At first I thought it was not real UTF-16, because Notepad++ identifies the file as UCS-2 and shows the content as garbage, but after re-checking with BabelPad I concluded that Notepad encodes even non-BMP characters correctly. Those are the characters that UTF-16 stores as surrogate pairs and that UCS-2 cannot represent.)
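To make the UTF-16LE claim concrete, here is a minimal Python sketch of the bytes such a file would contain. The function name `notepad_unicode_bytes` is hypothetical, and the leading byte order mark (FF FE) is an assumption about what Notepad writes, not something stated above.

```python
import codecs


def notepad_unicode_bytes(text: str) -> bytes:
    """Encode text the way a "Unicode" save is assumed to: BOM + UTF-16LE."""
    # Each UTF-16 code unit is two bytes, least significant byte first.
    payload = text.encode("utf-16-le")
    # Assumption: the file starts with the UTF-16LE byte order mark (FF FE).
    return codecs.BOM_UTF16_LE + payload


# 'A' is in the BMP; U+1F600 is outside it and needs a surrogate pair.
sample = "A\U0001F600"
data = notepad_unicode_bytes(sample)
print(data.hex())  # ff fe | 41 00 | 3d d8 00 de (printed without separators)
```

The last four bytes are the surrogate pair D83D DE00 in little-endian order. A reader that treats the file as UCS-2 would see those two code units as two separate, invalid characters, which would explain garbage output for non-BMP text.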

Similarly, “Unicode big endian” means UTF-16 Big Endian (UTF-16BE), and “ANSI” means the system’s native legacy encoding, e.g. the 8-bit windows-1252 encoding in Western versions of Windows.
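The three names can be told apart when reading a file back, roughly as in the sketch below. It assumes that the two UTF-16 variants begin with a byte order mark and that "ANSI" is windows-1252 (`cp1252` in Python), which holds only on Western systems. The helper `decode_notepad_bytes` and its `ansi_codec` parameter are hypothetical names, and the sketch ignores Notepad's separate UTF-8 option.

```python
import codecs


def decode_notepad_bytes(data: bytes, ansi_codec: str = "cp1252") -> str:
    """Decode bytes saved as "Unicode", "Unicode big endian", or "ANSI"."""
    # "Unicode": UTF-16LE, assumed to start with the BOM FF FE.
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    # "Unicode big endian": UTF-16BE, assumed to start with the BOM FE FF.
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # "ANSI": no BOM; fall back to the assumed legacy code page.
    return data.decode(ansi_codec)


little = codecs.BOM_UTF16_LE + "é".encode("utf-16-le")  # ff fe e9 00
big = codecs.BOM_UTF16_BE + "é".encode("utf-16-be")     # fe ff 00 e9
ansi = "é".encode("cp1252")                             # e9

print(decode_notepad_bytes(little), decode_notepad_bytes(big), decode_notepad_bytes(ansi))
```

On a non-Western system, a different code page would have to be passed as `ansi_codec`, since "ANSI" follows the system locale and is not one fixed encoding.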