I am looking for simple practical C++ examples on how to use ICU.
The ICU home page is not helpful in this regard.
I am not interested on what and why Unicode.
The few demos are not self contained and not compilable examples ( where are the includes? )
I am looking for something like 'Hello, World' of:
How to open and read a file encoded in UTF-8
How to use STL / Boost string functions to manipulate UTF-8 encoded strings
etc.
There's no special way to read a UTF-8 file unless you need to process a byte order mark (BOM). Because of the way UTF-8 encoding works, functions that read ANSI strings can also read UTF-8 strings.
The following code will read the contents of a file (ANSI or UTF-8) and do a couple of conversions.
#include <fstream>
#include <string>
#include <unicode/unistr.h>
int main(int argc, char** argv) {
std::ifstream f("...");
std::string s;
while (std::getline(f, s)) {
// at this point s contains a line of text
// which may be ANSI or UTF-8 encoded
// convert std::string to ICU's UnicodeString
UnicodeString ucs = UnicodeString::fromUTF8(StringPiece(s.c_str()));
// convert UnicodeString to std::wstring
std::wstring ws;
for (int i = 0; i < ucs.length(); ++i)
ws += static_cast<wchar_t>(ucs[i]);
}
}
Take a look at the online API reference.
If you want to use ICU through Boost, see Boost.Locale.