Environment: GCC/G++ on Linux
I have a file with a non-ASCII name on the filesystem and I want to open it.
I have the name as a wchar_t*, but I don't know how to open the file with it (my trusty fopen only accepts a char* filename).
Please help. Thanks a lot.
There are two possible answers:
If you want to make sure all Unicode filenames are representable, you can hard-code the assumption that the filesystem uses UTF-8 filenames. This is the "modern" Linux desktop-app approach. Just convert your strings from wchar_t (UTF-32) to UTF-8 with library functions (iconv would work well) or your own implementation (but look up the specs so you don't get it horribly wrong like Shelwien did), then use fopen.
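A minimal sketch of this first approach, assuming glibc's iconv (or GNU libiconv), where the encoding name "WCHAR_T" refers to the platform's native wchar_t representation. The helper names wide_to_utf8 and fopen_wide_utf8 are made up for illustration and are not part of any library.

```c
#include <iconv.h>
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: convert a wchar_t string to a newly allocated
   UTF-8 string. Returns NULL on failure; the caller frees the result. */
char *wide_to_utf8(const wchar_t *ws)
{
    /* Assumption: the iconv implementation knows the "WCHAR_T" name. */
    iconv_t cd = iconv_open("UTF-8", "WCHAR_T");
    if (cd == (iconv_t)-1)
        return NULL;

    size_t in_left = wcslen(ws) * sizeof(wchar_t);
    /* UTF-8 needs at most 4 bytes per code point, plus the terminator. */
    size_t out_size = wcslen(ws) * 4 + 1;
    char *out = malloc(out_size);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *in_ptr = (char *)ws;
    char *out_ptr = out;
    size_t out_left = out_size - 1;

    size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1) {
        free(out);
        return NULL;
    }
    *out_ptr = '\0';
    return out;
}

/* Hypothetical wrapper: open a file whose name is given as wchar_t*,
   assuming the filesystem stores names as UTF-8. */
FILE *fopen_wide_utf8(const wchar_t *wname, const char *mode)
{
    char *name = wide_to_utf8(wname);
    if (name == NULL)
        return NULL;
    FILE *f = fopen(name, mode);
    free(name);
    return f;
}
```

A call such as fopen_wide_utf8(wname, "rb") then behaves like fopen, returning NULL if either the conversion or the open fails.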
If you want to do things the more standards-oriented way, you should use wcsrtombs to convert the wchar_t string to a multibyte char string in the locale's encoding (which hopefully is UTF-8 anyway on any modern system) and use fopen. Note that this requires that you previously set the locale with setlocale(LC_CTYPE, "") or setlocale(LC_ALL, "").
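The locale-based approach could look roughly like this. It is a sketch only: wide_to_locale_mb and fopen_wide_locale are hypothetical names, and the code assumes the program has already called setlocale(LC_CTYPE, "") at startup, as described above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert a wchar_t string to a newly allocated
   multibyte string in the current locale's encoding. Returns NULL if a
   character cannot be represented; the caller frees the result. */
char *wide_to_locale_mb(const wchar_t *ws)
{
    mbstate_t st;
    const wchar_t *src = ws;

    /* First pass with a NULL destination: returns the number of bytes
       needed, not counting the terminating null byte. */
    memset(&st, 0, sizeof st);
    size_t len = wcsrtombs(NULL, &src, 0, &st);
    if (len == (size_t)-1)
        return NULL;

    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: do the actual conversion from a fresh state. */
    memset(&st, 0, sizeof st);
    src = ws;
    if (wcsrtombs(out, &src, len + 1, &st) == (size_t)-1) {
        free(out);
        return NULL;
    }
    return out;
}

/* Hypothetical wrapper: open a file named by a wchar_t string using the
   locale's encoding for the filename. */
FILE *fopen_wide_locale(const wchar_t *wname, const char *mode)
{
    char *name = wide_to_locale_mb(wname);
    if (name == NULL)
        return NULL;
    FILE *f = fopen(name, mode);
    free(name);
    return f;
}
```

If the locale's encoding cannot represent a character in the name, the conversion fails and the file cannot be opened this way, which is the trade-off against hard-coding UTF-8.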
And finally, not exactly an answer but a recommendation:
Storing filenames as wchar_t strings is probably a horrible mistake. You should instead store filenames as abstract byte strings, and only convert those to wchar_t just-in-time for displaying them in the user interface (if it's even necessary for that; many UI toolkits use plain byte strings themselves and do the interpretation as characters for you). This way you eliminate a lot of possible nasty corner cases, and you never encounter a situation where some files are inaccessible due to their names.
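To illustrate the recommendation: the filename stays a plain char string everywhere (and is passed to fopen unchanged), and a wide copy is made only when something has to be shown to the user. The helper name filename_for_display is hypothetical, and the sketch assumes the locale has been set with setlocale beforehand.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: build a display-only wide copy of a filename that
   is stored as a byte string. Returns NULL if the bytes are not valid in
   the locale's encoding; the file itself can still be opened by passing
   the original bytes to fopen. The caller frees the result. */
wchar_t *filename_for_display(const char *name)
{
    /* With a NULL destination, mbstowcs returns the number of wide
       characters needed, not counting the terminator. */
    size_t n = mbstowcs(NULL, name, 0);
    if (n == (size_t)-1)
        return NULL;

    wchar_t *w = malloc((n + 1) * sizeof *w);
    if (w == NULL)
        return NULL;

    mbstowcs(w, name, n + 1);
    return w;
}
```

When the conversion fails, a UI might fall back to showing an escaped form of the bytes; that choice is left open here.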