Find length of std::wstring

sazr · Feb 21, 2013 · Viewed 7.1k times

How can I determine the length (number of characters) of a std::wstring?

Using myStr.length() gives the byte size (I think), but it's not the number of characters. Do I need to write my own function to find the number of characters, or is there a native C++ or native WinAPI way?

Answer

jogojapan · Feb 21, 2013

std::wstring::length() will give you the number of characters, not the byte size, where a character is defined as the atomic unit of the wstring object, i.e. a wchar_t. This is what the Standard means when it refers to characters (see this post for some more details on the use of the word in the Standard).
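To make the distinction between element count and byte size concrete, here is a minimal sketch. The helper name storage_bytes is made up for illustration and is not part of the standard library; it only multiplies the element count by sizeof(wchar_t), which differs between platforms.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a standard function): size in bytes of the
// character data held by the string, excluding the terminating null.
std::size_t storage_bytes(const std::wstring& s) {
    // length() counts wchar_t elements, so the byte size is that count
    // multiplied by the platform-dependent size of one wchar_t.
    return s.length() * sizeof(wchar_t);
}
```

For std::wstring s = L"hello", s.length() is 5 regardless of platform, while storage_bytes(s) is 5 * sizeof(wchar_t).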

However, when it comes to Unicode characters, whether one wchar_t corresponds to one Unicode character depends on the encoding used inside the wstring. If UTF-16 is used, which is often (but not necessarily) the case, e.g. on Windows, where wchar_t is 16 bits wide, one wchar_t will correspond to one Unicode character only for the Basic Multilingual Plane (i.e. the characters covered by the ISO-8859 character sets as well as most of the commonly used CJK characters, but not some of the more exotic ones, e.g. classical Chinese characters)(*). Characters outside that plane are encoded as a surrogate pair and occupy two wchar_t elements. If you want to get the character count right for all Unicode characters in that case, you need to use a Unicode-aware library (e.g. ICU), or code it yourself.

(*) There are additional problems if combining characters are used, as @一二三 correctly points out. Counting those correctly is also best done using appropriate libraries.
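For the "code it yourself" route, a rough sketch of counting Unicode code points could look like the following. The function name count_code_points is hypothetical, and the code assumes the wstring holds UTF-16 when wchar_t is 16 bits wide and UTF-32 when it is wider. It treats an unpaired surrogate as one character and does not merge combining characters, so it is not a substitute for a library such as ICU.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a wstring.
// Assumption: UTF-16 content if wchar_t is 16 bits, UTF-32 otherwise.
std::size_t count_code_points(const std::wstring& s) {
    if (sizeof(wchar_t) > 2) {
        // With a 32-bit wchar_t, each element is already one code point.
        return s.length();
    }
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.length(); ++i) {
        const unsigned long unit = static_cast<unsigned long>(s[i]) & 0xFFFFul;
        const bool is_high = unit >= 0xD800ul && unit <= 0xDBFFul;
        if (is_high && i + 1 < s.length()) {
            const unsigned long next =
                static_cast<unsigned long>(s[i + 1]) & 0xFFFFul;
            // A high surrogate followed by a low surrogate forms one
            // code point, so skip the second half of the pair.
            if (next >= 0xDC00ul && next <= 0xDFFFul) {
                ++i;
            }
        }
        ++count;
    }
    return count;
}
```

With this sketch, count_code_points(L"a\U0001F600b") is 3, whereas on a platform with a 16-bit wchar_t (as with MSVC on Windows) length() reports 4 for the same string, because U+1F600 lies outside the Basic Multilingual Plane.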