C++ WCHAR manipulations

Geradlus_RU · Nov 13, 2012 · Viewed 19.6k times

I'm developing a tiny Win32 app in C++. I studied C++ fundamentals a long time ago, so now I'm completely confused by character strings in C++. Back then there was no WCHAR or TCHAR, only char and String. After a little investigation I've decided not to use TCHAR.

My issue is very simple, I think, but I can't find a clear guide on how to manipulate strings in C++. After coding in PHP for the last few years, I expected string manipulation to be simple, and I was wrong!

Simply put, all I need is to put new data into a character string.

    WCHAR* cs = L"\0";
    swprintf( cs, "NEW DATA" );

This was my first attempt. When debugging my app I found that swprintf puts only the first 2 chars into my cs variable. I resolved the problem this way:

    WCHAR cs[1000];
    swprintf( cs, "NEW DATA" );

But in general this trick could fail, because in my case the new data is not a constant value but another variable, which could potentially be longer than 1000 chars. My code looks like this:

    WCHAR cs[1000];
    WCHAR* nd1;
    WCHAR* nd2;
    wcscpy(nd1, L"Some value");
    wcscpy(nd2, L"Another value"); // Actually these variables store the paths of user-selected folders
    swprintf( cs, "The paths are %s and %s", nd1, nd2);

In this case there is a possibility that the total character count of nd1 and nd2 could be greater than 1000 chars, so critical data would be lost.

The question is: how can I copy all the data I need into a WCHAR string declared as WCHAR* wchar_var; without losing anything?

P.S. Since I'm Russian, the question may be unclear. Let me know if it is, and I'll try to explain my issue more clearly and in more detail.

Answer

Mr.C64 · Nov 13, 2012

In modern Windows programming, it's OK to just ignore TCHAR and instead use wchar_t (WCHAR) and Unicode UTF-16.

(TCHAR is a model of the past, from when you wanted a single code base that could produce both ANSI/MBCS and Unicode builds by changing preprocessor switches like _UNICODE and UNICODE.)

In any case, you should use C++ and convenient string classes to simplify your code. You can use ATL::CString (which corresponds to CStringW in Unicode builds, which are the default since VS2005), or STL's std::wstring.

Using CString, you can do:

    CString str1 = L"Some value";
    CString str2 = L"Another value";
    CString cs;
    cs.Format(L"The paths are %s and %s", str1.GetString(), str2.GetString());
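
With std::wstring, the other option mentioned above, one way to get a similar result is to stream the pieces into a std::wostringstream. This is only a sketch: FormatPaths is a hypothetical helper name chosen for illustration, not part of any library.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: builds the message without any fixed-size buffer.
std::wstring FormatPaths(const std::wstring& path1, const std::wstring& path2)
{
    std::wostringstream out;
    out << L"The paths are " << path1 << L" and " << path2;
    return out.str();   // the stream grows as needed, so nothing is truncated
}
```

A call such as FormatPaths(L"Some value", L"Another value") returns the whole message regardless of how long the two paths are.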

CString also provides proper overloads of operator+ to concatenate strings, so you don't have to calculate the total length of the resulting string, dynamically allocate a destination buffer or check the size of an existing one, call wcscpy and wcscat, remember to release the buffer, and so on.
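
std::wstring supports concatenation in the same way through operator+ and operator+=. A minimal sketch, assuming a hypothetical JoinPath helper (not a Win32 or standard library function) that appends a file name to a folder path:

```cpp
#include <string>

// Hypothetical helper: joins a folder and a file name with a backslash.
std::wstring JoinPath(const std::wstring& folder, const std::wstring& file)
{
    std::wstring result = folder;   // copy; storage is managed by the string
    if (!result.empty() && result.back() != L'\\')
        result += L'\\';            // append a separator only when one is missing
    return result + file;           // operator+ allocates the combined string
}
```

No buffer size is computed by hand here; the string object allocates and releases its own memory.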

And you can simply pass instances of CString to Win32 APIs expecting const wchar_t* (LPCWSTR/PCWSTR) parameters, since CString offers an implicit conversion operator to const wchar_t*.
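
If you choose std::wstring instead, there is no implicit conversion, so you call c_str() explicitly. In the sketch below, TextLength is a hypothetical stand-in for any Win32 API that takes a const wchar_t* parameter, and PassWideString is likewise an illustrative name.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical stand-in for a Win32 API taking const wchar_t* (LPCWSTR).
std::size_t TextLength(const wchar_t* text)
{
    return std::wcslen(text);   // a real API would use the text; this just measures it
}

std::size_t PassWideString(const std::wstring& message)
{
    // c_str() returns a null-terminated pointer owned by the string object
    return TextLength(message.c_str());
}
```

The pointer returned by c_str() stays valid only while the std::wstring is alive and unmodified, so it should not be stored beyond the call.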