When my PHP script is run with UTF-8 encoding and the input contains non-ASCII characters, some PHP functions like strtolower() don't work.
I could use mb_strtolower, but this script can be run on all sorts of different platforms and configurations, and the multibyte string extension might not be available. I could check whether the function exists before use, but I have string functions littered throughout my code and would rather not replace every instance.
Someone suggested using setlocale(LC_CTYPE, 'C'), which he says causes the string functions to work correctly. This sounds fine, but I don't want to introduce that change without understanding exactly what it is doing. I have used setlocale to change the formatting of numbers before, but I have not used the LC_CTYPE flag before, and I don't really understand what it does. What does the value 'C' mean?
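C is the minimal default locale defined by the C language standard (hence the name; it is also known as the POSIX locale). It is what a program gets when no other locale has been selected. However, it is not a UTF-8 locale: its character classification covers only ASCII, so with LC_CTYPE set to 'C', strtolower() changes only A-Z and leaves every other byte alone.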
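To make the effect concrete, here is a small sketch of what strtolower() does to a UTF-8 string once LC_CTYPE is set to 'C'. The sample string is my own assumption, not something from the question; it is written as explicit bytes so the result does not depend on how the file is saved.

```php
<?php
// Illustrative sketch: the input string is an assumed example.
setlocale(LC_CTYPE, 'C');

// "ÀBC" as raw UTF-8 bytes: À is the two-byte sequence C3 80.
$input  = "\xC3\x80BC";
$result = strtolower($input);

// Only the ASCII letters B and C are lowercased; the two bytes
// of "À" pass through untouched (not lowercased, but not corrupted).
echo bin2hex($input), "\n";   // c3804243
echo bin2hex($result), "\n";  // c3806263
```

So the 'C' setting keeps the byte-oriented functions from mangling multibyte sequences, but it does not make them UTF-8 aware: À stays uppercase.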
If you are using a multibyte charset such as UTF-8, the regular string functions only work byte by byte, so they cannot handle non-ASCII characters correctly; you need the mb_ counterparts for that. mbstring is not part of the default PHP build, but most PHP installations have the extension enabled.
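If you want to avoid checking for the extension at every call site, one option is a single wrapper. This is only a sketch: lower_utf8() is a hypothetical helper name, and the fallback is a design choice of mine (ASCII-only lowercasing via strtr(), which works on bytes and ignores the locale), not something PHP provides.

```php
<?php
// Hypothetical helper: use mbstring when present, otherwise
// lowercase ASCII letters only and leave all other bytes alone.
function lower_utf8($s)
{
    if (function_exists('mb_strtolower')) {
        return mb_strtolower($s, 'UTF-8');
    }
    // Byte-for-byte translation, independent of the current locale.
    return strtr(
        $s,
        'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
        'abcdefghijklmnopqrstuvwxyz'
    );
}

echo lower_utf8("HELLO"), "\n"; // hello
```

Without mbstring the non-ASCII characters are simply left as they are, which is at least safe for UTF-8 data; with mbstring they are lowercased properly.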