How does the strlen function work internally?

Manu · Nov 9, 2010 · Viewed 50.9k times

How does strlen() work internally? Are there any inherent bugs in the function?

Answer

paxdiablo · Nov 9, 2010

strlen usually works by counting the characters in a string until a \0 character is found. A canonical implementation would be:

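#include <stddef.h>  /* size_t */

/* Counts the characters that precede the terminating '\0'. */
size_t strlen(const char *str) {
    size_t len = 0;
    while (*str != '\0') {  /* stop at the terminator, which is not counted */
        str++;
        len++;
    }
    return len;
}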

As for possible inherent bugs in the function, there are none - it works exactly as documented. That's not to say it doesn't have certain problems, to wit:

  • if you pass it a "string" that doesn't have a \0 at the end, it will keep reading past the end of the buffer, which is undefined behavior, but technically that's not a C string (a) and it's your own fault.
  • you can't put \0 characters within your string but, again, it wouldn't be a C string in that case.
  • it's not the most efficient way - it has to scan the whole string, so the cost grows with the length, whereas storing a length up front would let you get the length in constant time.

But none of these are bugs, they're just consequences of a design decision.
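
The first two bullet points can be illustrated with a short sketch. Note that bounded_strlen and embedded_terminator_len are hypothetical names made up for this example, not standard library functions; the only library calls assumed are the standard memchr and strlen.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the standard library): returns the
 * length of the string held in buf, or cap if no '\0' is found within
 * the first cap bytes. Unlike strlen, it never looks past buf[cap - 1]. */
size_t bounded_strlen(const char *buf, size_t cap) {
    assert(buf != NULL);
    const char *end = memchr(buf, '\0', cap);
    return end != NULL ? (size_t)(end - buf) : cap;
}

/* Shows what strlen reports for data with an embedded terminator:
 * counting stops at the first '\0', so the bytes after it are ignored. */
size_t embedded_terminator_len(void) {
    const char data[] = "abc\0def";  /* 8 bytes, including the final '\0' */
    return strlen(data);             /* 3, not 7 */
}
```

With a four-byte buffer holding 'a', 'b', 'c', 'd' and no terminator, bounded_strlen(buf, 4) gives 4 rather than running off the end, which is the kind of guard you need when you can't be sure the data is a real C string.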

On that last bullet point, see also this excellent article by Joel Spolsky where he discusses various string formats and their characteristics, including normal C strings (with a terminator), Pascal strings (with a length) and the combination of the two, null-terminated Pascal strings.

Though he has a more, shall we say, "colorful" term for that final type, one which frequently comes to mind whenever I think of Python's excellent (and totally unrelated) f-strings :-)
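
To make the "length up front" idea concrete, here is a minimal sketch of a string that carries its length and still keeps a trailing \0 for compatibility with C functions. The lp_string type and the lp_* functions are hypothetical names invented for this illustration, and using size_t for the length is an assumption of the sketch (classic Pascal strings used a much smaller length field).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-prefixed string: the length is stored up front
 * and the bytes are still followed by a '\0' for C compatibility. */
typedef struct {
    size_t len;   /* number of bytes, not counting the terminator */
    char  *data;  /* len bytes followed by a single '\0' */
} lp_string;

/* Copies n bytes from src, which may contain embedded '\0' bytes.
 * Returns 0 on success, -1 if the allocation fails. */
int lp_init(lp_string *s, const char *src, size_t n) {
    assert(s != NULL && src != NULL);
    s->data = malloc(n + 1);
    if (s->data == NULL) {
        return -1;
    }
    memcpy(s->data, src, n);
    s->data[n] = '\0';
    s->len = n;
    return 0;
}

/* Constant time: reads the stored length instead of scanning for '\0'. */
size_t lp_length(const lp_string *s) {
    assert(s != NULL);
    return s->len;
}

/* Releases the buffer and resets the struct to an empty state. */
void lp_free(lp_string *s) {
    assert(s != NULL);
    free(s->data);
    s->data = NULL;
    s->len = 0;
}
```

If you initialise one of these from the seven bytes "abc\0def", lp_length reports 7 while strlen on the same data still stops at 3, which shows both the gain (embedded \0 characters, constant-time length) and the cost (extra storage, and a length that has to be kept in sync).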


(a) A C string is defined as a series of non-terminator characters (any character other than \0) followed by a terminator. Hence this definition disallows both embedded terminators within the sequence, and sequences without such a terminator. Or, putting it more succinctly (as per the ISO C standard):

A string is a contiguous sequence of characters terminated by and including the first null character.