Platform's default charset on different platforms?

Robert · Feb 16, 2012 · Viewed 34.3k times

Some legacy code relies on the platform's default charset for translations. For Windows and Linux installations in the "western world" I know what that means. But thinking about Russian or Asian platforms I am totally unsure what their platform's default charset is (just UTF-16?).

Therefore I would like to know what I would get when executing the following code line:

System.out.println("Default Charset=" + Charset.defaultCharset());
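For anyone who wants to report a concrete value from their own machine, the line above can be wrapped in a small self-contained program. This is only a sketch: the class name `DefaultCharsetReport` and the `describe()` helper are made up for illustration, and it additionally prints the `file.encoding` system property, which is often (but not necessarily) what determines the default.

```java
import java.nio.charset.Charset;

public class DefaultCharsetReport {

    // Builds the report string; the class and method names are illustrative only.
    static String describe() {
        return "Default Charset=" + Charset.defaultCharset()
                + ", file.encoding=" + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        // The output depends on the OS, the locale, and how the JVM was started.
        System.out.println(describe());
    }
}
```

Running it on each target platform, together with the OS name and locale, gives the kind of concrete data point asked for here.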

PS:

I don't want to discuss the problems of charsets and how they differ from Unicode here. I just want to collect which operating systems result in which specific charset. Please post only concrete values!

Answer

Aaron Digulla · Feb 16, 2012

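That's a user-specific setting. On many modern Linux systems, it's UTF-8. On Macs, it has traditionally been MacRoman. On Windows in the US and Western Europe, it's usually CP1252; in Central Europe (Polish or Czech installations, for example) it's CP1250, and Russian installations typically use CP1251. In China, you often find a Simplified Chinese encoding (one of the GB* family, such as GBK); Big5 is the Traditional Chinese encoding common in Taiwan and Hong Kong.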

But that's only the system default, and each user can change it at any time. That points to the likely solution: set the encoding explicitly when you start your app, using the system property file.encoding.

See this answer for how to do that. I suggest putting this into a small script that starts your app, so the user default isn't tainted.
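As a complement to the launcher script, the app itself could check at startup that it was really started with the intended encoding, for example via `java -Dfile.encoding=UTF-8 -jar app.jar` (the jar name is a placeholder). The following is a minimal sketch, not a definitive implementation: the class name `CharsetGuard`, the helper `defaultIs`, and the choice of UTF-8 as the expected charset are all assumptions made for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetGuard {

    // Returns true if the JVM's default charset equals the expected one.
    static boolean defaultIs(Charset expected) {
        return Charset.defaultCharset().equals(expected);
    }

    public static void main(String[] args) {
        // Assumption: the legacy code expects UTF-8; substitute the charset your data uses.
        Charset expected = StandardCharsets.UTF_8;

        if (!defaultIs(expected)) {
            System.err.println("Expected default charset " + expected
                    + " but found " + Charset.defaultCharset()
                    + "; start the JVM with -Dfile.encoding=" + expected.name());
            System.exit(1);
        }
        System.out.println("Default charset OK: " + expected);
    }
}
```

Failing fast like this makes a missing or bypassed launcher script visible immediately, instead of showing up later as garbled translations.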