What does setlocale(LC_CTYPE, 'C'); actually do?

Russ · Mar 8, 2011 · Viewed 18.4k times

When my PHP script handles UTF-8 text containing non-ASCII characters, some PHP functions such as strtolower() don't work correctly.

I could use mb_strtolower, but this script can be run on all sorts of different platforms and configurations, and the multibyte string extension might not be available. I could check whether the function exists before use, but I have string functions littered throughout my code and would rather not replace every instance.

Someone suggested using setlocale(LC_CTYPE, 'C'), which he says causes the string functions to work correctly. This sounds fine, but I don't want to introduce that change without understanding exactly what it is doing. I have used setlocale() to change the formatting of numbers before, but I have not used the LC_CTYPE category, and I don't really understand what it does. What does the value 'C' mean?

Answer

ThiefMaster · Mar 8, 2011

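C means "use whatever locale is hard coded" (and since most *NIX programs are written in C, it's called C). It is the minimal default locale, and it is usually not a UTF-8 locale. With LC_CTYPE set to C, functions such as strtolower() only convert the ASCII letters A-Z and leave every other byte alone, so multibyte characters are not corrupted, but they are not lowercased either.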
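A rough sketch of what that looks like in practice. The variable names are made up for the example, and the input is written as explicit UTF-8 bytes so that the file's own encoding doesn't matter:

```php
<?php
// Remember the current LC_CTYPE setting; passing "0" only queries it.
$previous = setlocale(LC_CTYPE, '0');

// Switch only the character-classification category to the "C" locale.
// Other categories (LC_NUMERIC, LC_TIME, ...) are left as they were.
setlocale(LC_CTYPE, 'C');

// "ÀBC DéJÀ" encoded as UTF-8 (illustrative input).
$text  = "\xC3\x80BC D\xC3\xA9J\xC3\x80";
$lower = strtolower($text);

// Only the ASCII letters changed; the bytes of "À" and "é" are untouched,
// so the result is still valid UTF-8, but "À" has not become "à".
var_dump($lower === "\xC3\x80bc d\xC3\xA9j\xC3\x80");

// Restore the earlier setting if one was reported.
if ($previous !== false) {
    setlocale(LC_CTYPE, $previous);
}
```

So the call makes the single-byte functions safe to run on UTF-8 data, but it does not make them aware of non-ASCII letters.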

If you are using multibyte charsets such as UTF-8, you cannot rely on the regular string functions to handle non-ASCII characters; you need their mb_ counterparts from the mbstring extension. However, almost every PHP installation should have this extension enabled.
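If you still want to guard against mbstring being absent, one option is a small wrapper. This is only a sketch: the name lower_utf8() is hypothetical, and it assumes all input is UTF-8.

```php
<?php
// Hypothetical helper: prefer mbstring, fall back to the byte-wise function.
function lower_utf8($s)
{
    if (function_exists('mb_strtolower')) {
        // Proper case conversion for non-ASCII letters as well.
        return mb_strtolower($s, 'UTF-8');
    }
    // Fallback: only ASCII letters are converted. This is safe for UTF-8
    // when LC_CTYPE is "C" (or on PHP versions where strtolower() is
    // locale-independent); other letters are simply left as they are.
    return strtolower($s);
}

echo lower_utf8("HELLO"), "\n";
```

Calling the wrapper everywhere instead of strtolower() keeps the availability check in one place.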