UTF-16 string terminator

Ray · May 7, 2011

What is the string terminator sequence for a UTF-16 string?

EDIT:

Let me rephrase the question in an attempt to clarify. How does the call to wcslen() work?

Answer

Michael Petrotta · May 7, 2011

Unicode does not define string terminators. Your environment or language does. For instance, C strings use 0x0 as a string terminator, whereas .NET strings use a separate value in the String class to store the length of the string.

To answer your second question, wcslen looks for a terminating L'\0' character, that is, a single wchar_t whose value is zero. How many 0x00 bytes that is depends on the size of wchar_t, which varies by compiler and platform, but it will likely be the two-byte sequence 0x00 0x00 if you're using UTF-16 (encoding U+0000, 'NUL').
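As a rough illustration of that search, here is a minimal sketch of what a wcslen-style loop does. The names sketch_wcslen and terminator_size_bytes are hypothetical, made up for this example; a real C library's wcslen is usually more heavily optimized, but it should return the same count.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for wcslen: walks the string one wchar_t at a
 * time and stops at the first element equal to L'\0'. The result is a
 * count of wchar_t units, not bytes and not Unicode code points. */
size_t sketch_wcslen(const wchar_t *s)
{
    const wchar_t *p = s;
    while (*p != L'\0') {
        ++p;
    }
    return (size_t)(p - s);
}

/* Hypothetical helper: the terminator is exactly one wchar_t, so its
 * size in bytes is whatever sizeof(wchar_t) is on this compiler. */
size_t terminator_size_bytes(void)
{
    return sizeof(wchar_t);
}
```

For example, sketch_wcslen(L"hello") returns 5. Where wchar_t is 16 bits wide, as with Microsoft's compilers on Windows, terminator_size_bytes() returns 2, which matches the 0x00 0x00 sequence described above. Where wchar_t is 32 bits wide, as is typical with glibc on Linux, it returns 4, and the wide string is not UTF-16 in the first place.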