How does strlen() work internally? Are there any inherent bugs in the function?
strlen usually works by counting the characters in a string until a \0 character is found. A canonical implementation would be:
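    #include <stddef.h>  /* size_t */

    size_t strlen (const char *str) {
        size_t len = 0;
        /* Advance until the terminator; the terminator itself is not counted. */
        while (*str != '\0') {
            str++;
            len++;
        }
        return len;
    }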
As for possible inherent bugs in the function, there are none - it works exactly as documented. That's not to say it doesn't have certain problems, to wit:
- If you pass it a "string" that doesn't have a \0 at the end, you may run into problems but technically, that's not a C string (a) and it's your own fault.
- You can't have \0 characters within your string but, again, it wouldn't be a C string in that case.

But none of these are bugs; they're just consequences of a design decision.
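To illustrate the embedded \0 case, here is a minimal sketch. The helper name bytes_hidden_from_strlen is hypothetical (it is not part of any library), and it assumes the buffer really does contain a terminator somewhere within size bytes:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a library function): given an array of `size`
   bytes that contains at least one terminator, return how many bytes lie
   beyond the first terminator, i.e. bytes that strlen() never examines. */
size_t bytes_hidden_from_strlen(const char *buf, size_t size) {
    size_t visible = strlen(buf) + 1;  /* characters plus the first \0 */
    return size - visible;
}
```

For an array initialised from the literal "abc\0def", sizeof reports 8 bytes but strlen stops at the first terminator and reports 3, so the helper reports 4 hidden bytes (d, e, f and the final terminator).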
On that last bullet point, see also this excellent article by Joel Spolsky where he discusses various string formats and their characteristics, including normal C strings (with a terminator), Pascal strings (with a length) and the combination of the two, null terminated Pascal strings.
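As a rough sketch of the length-prefixed alternative, assuming a single length byte followed by the characters: the pstring type and its two functions below are hypothetical names made up for illustration, not a standard C facility.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical length-prefixed ("Pascal-style") string: one length byte
   followed by up to 255 characters. No terminator is needed. */
typedef struct {
    unsigned char len;
    char data[255];
} pstring;

/* Build a pstring from n raw bytes, truncating at 255. Because the length
   is stored explicitly, the bytes may include \0 characters. */
pstring pstring_from_bytes(const char *bytes, size_t n) {
    pstring p = {0};
    if (n > sizeof p.data) {
        n = sizeof p.data;
    }
    p.len = (unsigned char)n;
    memcpy(p.data, bytes, n);
    return p;
}

/* The length is a single read, however long the string is. */
size_t pstring_len(const pstring *p) {
    return p->len;
}
```

The trade-off in this sketch is the one the article discusses: the length lookup no longer scans the characters and embedded \0 bytes are allowed, but the one-byte prefix caps the length at 255.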
Though he has a more, shall we say, "colorful" term for that final type, one which frequently comes to mind whenever I think of Python's excellent (and totally unrelated) f-strings :-)
(a) A C string is defined as a series of non-terminator characters (any character other than \0) followed by a terminator. Hence this definition disallows both embedded terminators within the sequence, and sequences without such a terminator. Or, putting it more succinctly (as per the ISO C standard):
A string is a contiguous sequence of characters terminated by and including the first null character.
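When the size of the underlying buffer is known, that definition can be checked before calling strlen. The following is a minimal sketch using the standard memchr; the helper names holds_c_string and bounded_len are hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if the first `size` bytes of buf contain a
   terminator, i.e. buf holds a string in the sense quoted above. */
bool holds_c_string(const char *buf, size_t size) {
    return memchr(buf, '\0', size) != NULL;
}

/* Hypothetical helper: length up to the first terminator, never reading
   past `size` bytes. Returns `size` if no terminator is found. */
size_t bounded_len(const char *buf, size_t size) {
    const char *end = memchr(buf, '\0', size);
    return end ? (size_t)(end - buf) : size;
}
```

Unlike strlen, neither helper reads beyond the bounds it is given, so both are safe to call on an array that may be missing its terminator.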