About the "Character set" option in Visual Studio

Lion King · Feb 19, 2012 · Viewed 41.1k times

I have a question about the "Character set" option in Visual Studio. The available options are:

  • Not Set
  • Use Unicode Character Set
  • Use Multi-Byte Character Set

What is the difference between these three options?

Also, will choosing one of them affect support for languages other than English (such as RTL languages)?

Answer

Hans Passant · Feb 19, 2012

It is a compatibility setting, intended for legacy code written for old versions of Windows that were not Unicode enabled: the Windows 9x family, of which Windows ME was the last and widely ignored member. With "Not Set" or "Use Multi-Byte Character Set" selected, every Windows API function that takes a string argument is redefined to a small compatibility helper function that translates char* strings to wchar_t* strings, the API's native string type.
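The redefinition described above can be sketched in portable C++. This is only an illustration under stated assumptions: ShowTextW, ShowTextA, ShowText and TEXT_LIT are hypothetical stand-ins (the Windows headers do the same kind of thing for real pairs such as MessageBoxW and MessageBoxA, keyed on the UNICODE macro that "Use Unicode Character Set" defines), and the widening step here handles ASCII only instead of consulting the system code page.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a native "W" API function: it works on wide strings.
inline std::wstring ShowTextW(const wchar_t* text) {
    return std::wstring(L"[native] ") + text;
}

// Hypothetical stand-in for the "A" compatibility helper: widen, then forward.
// Assumption: ASCII only. The real helpers convert using the system code page.
inline std::wstring ShowTextA(const char* text) {
    std::wstring wide;
    for (const char* p = text; *p != '\0'; ++p) {
        assert(static_cast<unsigned char>(*p) < 128 && "sketch handles ASCII only");
        wide.push_back(static_cast<wchar_t>(*p));
    }
    return ShowTextW(wide.c_str());
}

// The generic name resolves to one of the two functions at compile time.
#ifdef UNICODE
#define ShowText ShowTextW
#define TEXT_LIT(s) L##s
#else
#define ShowText ShowTextA
#define TEXT_LIT(s) s
#endif
```

With either setting, ShowText(TEXT_LIT("hi")) ends up in ShowTextW. The only difference is whether the narrow-string helper runs first.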

Such code critically depends on the default system code page setting. The code page maps 8-bit characters to Unicode, which in turn selects the font glyph. Your program will only produce correct text when the machine that runs it has the correct code page. Characters with a value >= 128 are rendered incorrectly if the code page doesn't match.
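To make the code page dependency concrete, here is a minimal sketch that decodes the same byte under three Windows code pages. The CodePage enum and decodeByte function are hypothetical names introduced for illustration, and only a slice of each table is implemented (the ranges used are the letter blocks of Windows-1252, Windows-1251 and Windows-1255). Anything outside those slices is reported as U+FFFD.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical selector for the "system code page" in this sketch.
enum class CodePage { Western1252, Cyrillic1251, Hebrew1255 };

// Decode one 8-bit character to a Unicode code point under the given code page.
// Only part of each table is covered; unknown bytes yield U+FFFD.
inline char32_t decodeByte(std::uint8_t byte, CodePage cp) {
    if (byte < 0x80) {
        return static_cast<char32_t>(byte);  // ASCII is the same everywhere
    }
    switch (cp) {
    case CodePage::Western1252:
        // 0xA0..0xFF coincide with U+00A0..U+00FF
        if (byte >= 0xA0) return static_cast<char32_t>(byte);
        break;
    case CodePage::Cyrillic1251:
        // 0xC0..0xFF map to U+0410..U+044F
        if (byte >= 0xC0) return static_cast<char32_t>(0x0410 + (byte - 0xC0));
        break;
    case CodePage::Hebrew1255:
        // 0xE0..0xFA map to U+05D0..U+05EA
        if (byte >= 0xE0 && byte <= 0xFA) return static_cast<char32_t>(0x05D0 + (byte - 0xE0));
        break;
    }
    return static_cast<char32_t>(0xFFFD);  // not covered by this sketch
}
```

The byte 0xE0 comes out as U+00E0 (à), U+0430 (Cyrillic а) or U+05D0 (Hebrew alef) depending on which code page is passed, while bytes below 128 are unaffected.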

Always select "Use Unicode Character Set" for modern code, especially when you want to support languages with a right-to-left layout and you don't have an Arabic or Hebrew code page selected on your dev machine. Use std::wstring or wchar_t[] in your code. Getting actual RTL layout requires turning on the WS_EX_RTLREADING extended style flag in the CreateWindowEx() call.
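A rough sketch of that advice follows. hebrewGreeting and createRtlLabel are hypothetical helper names, and the control class, position and size are arbitrary choices. The Windows-only part is guarded so the rest still compiles elsewhere. It assumes you already have a parent window and an instance handle from your own startup code.

```cpp
#include <cassert>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Wide string holding Hebrew text as Unicode code points, independent of
// whatever code page the build machine uses. (Hypothetical helper.)
inline std::wstring hebrewGreeting() {
    return L"\u05E9\u05DC\u05D5\u05DD";  // "shalom"
}

#ifdef _WIN32
// Create a static text control that requests right-to-left reading order.
// (Hypothetical helper; geometry values are arbitrary.)
inline HWND createRtlLabel(HWND parent, HINSTANCE instance) {
    assert(parent != nullptr);  // WS_CHILD windows need a parent
    const std::wstring text = hebrewGreeting();
    return CreateWindowExW(WS_EX_RTLREADING,       // extended style
                           L"STATIC",              // predefined control class
                           text.c_str(),           // window text
                           WS_CHILD | WS_VISIBLE,  // regular style
                           10, 10, 200, 24,        // x, y, width, height
                           parent, nullptr, instance, nullptr);
}
#endif
```

Calling CreateWindowExW explicitly, rather than the generic CreateWindowEx name, keeps the example independent of the project's Character Set setting.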