C Programming - Functionality of strlen

Ryan Barker · Feb 28, 2014 · Viewed 9.1k times

I'm working to try and understand some string functions so I can more effectively use them in later coding projects, so I set up the simple program below:

#include <stdio.h>
#include <string.h>

int main (void)
{
    // Declare variables:
    char test_string[5];
    char test_string2[] = { 'G', 'O', '_', 'T', 'E', 'S', 'T'};
    int init;
    int length = 0;
    int match;

    // Initialize array:
    for (init = 0; init < strlen(test_string); init++)
    {
        test_string[init] = '\0';
    }

    // Fill array:
    test_string[0] = 'T';
    test_string[1] = 'E';
    test_string[2] = 'S';
    test_string[3] = 'T';

    // Get Length:
    length = strlen(test_string);

    // Get number of characters from string 1 in string 2:
    match = strspn(test_string, test_string2);

    printf("\nstrlen return = %d", length);
    printf("\nstrspn return = %d\n\n", match);

    return 0;
}

I expect to see a return of:

strlen return = 4
strspn return = 4

However, I see strlen return = 6 and strspn return = 4. From what I understand, char test_string[5] should allocate 5 bytes of memory and place hex 00 into the fifth byte. The for loop (which should not even be necessary) should then set all the bytes of memory for test_string to hex 00. Then, the immediately following lines should fill test_string bytes 1 through 4 (or test_string[0] through test_string[3]) with what I have specified. Calling strlen at this point should return 4, because it should start at the address of test_string[0] and increment a count until it hits the first null character, which is at test_string[4]. Yet strlen returns 6. Can anyone explain this? Thanks!

Answer

Keith Thompson · Feb 28, 2014
char test_string[5];

test_string is an array of 5 uninitialized char objects.

for (init = 0; init < strlen(test_string); init++)

Kaboom. strlen scans for the first '\0' null character. Since the contents of test_string are garbage, the behavior is undefined. It might return a small value if there happens to be a null character, or a large value or program crash if there don't happen to be any zero bytes in test_string.
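To make the scanning behavior concrete, here is a minimal sketch of what a strlen-like function does. The name my_strlen is hypothetical, and real library implementations are usually more optimized, but the contract is the same: the function keeps reading until it finds a '\0', and it has no way of knowing where the array ends.

```c
#include <stddef.h>

/* Hypothetical helper, not the actual library implementation:
   counts bytes starting at s until the first '\0' is found.
   If s does not point to a terminated string, the loop reads
   past the end of the array and the behavior is undefined. */
size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}
```

With an uninitialized test_string, the first '\0' may be found anywhere, inside the array or beyond it, which is consistent with the unexpected 6 reported in the question.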

Even if that weren't the case, evaluating strlen() in the header of a for loop is inefficient. Each strlen() call has to re-scan the entire string (assuming you've given it a valid string), so if your loop worked it would be O(N^2).
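A rough sketch of how both problems could be avoided, assuming the goal is to zero the whole array and, separately, to loop over a valid string. The helper names clear_buffer and count_char are hypothetical and used only for illustration. The first takes its bound from the array size rather than from strlen; the second calls strlen once, before the loop.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: zero every byte of a buffer whose size is known.
   The bound comes from the buffer size, not from strlen, so the
   previous (possibly garbage) contents do not matter. */
void clear_buffer(char *buf, size_t size)
{
    memset(buf, 0, size);
}

/* Hypothetical helper: count occurrences of c in a valid string.
   strlen is evaluated once, outside the loop header, so the whole
   function stays O(N). */
size_t count_char(const char *s, char c)
{
    size_t len = strlen(s);
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (s[i] == c) {
            count++;
        }
    }
    return count;
}
```

For the array in the question, the call would be clear_buffer(test_string, sizeof test_string). Note that sizeof only gives the array size where the array itself is in scope, not through a pointer parameter, which is why the size is passed explicitly.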

If you want test_string to contain just zero bytes, you can initialize it that way:

char test_string[5] = "";

or, since you initialize the first 4 bytes later:

char test_string[5] = "TEST";

or just:

char test_string[] = "TEST";

(The last form lets the compiler figure out that it needs 5 bytes.)

Going back to your declarations:

char test_string2[] = { 'G', 'O', '_', 'T', 'E', 'S', 'T'};

This causes test_string2 to be 7 bytes long, without a trailing '\0' character. That means that passing test_string2 to any function that expects a pointer to a string will cause undefined behavior. You probably want something like:

char test_string2[] = "GO_TEST";
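The difference between the two declarations can be checked without invoking undefined behavior. The sketch below uses memchr, which is bounded by an explicit size, to test whether an array contains a '\0' at all. The names is_terminated and compare_declarations are hypothetical, chosen for this illustration.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if a '\0' occurs within the first
   size bytes of buf, 0 otherwise. Unlike strlen, it never reads
   past the size it is given. */
int is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* Hypothetical demo: compares the two ways of declaring test_string2. */
void compare_declarations(void)
{
    char braces[] = { 'G', 'O', '_', 'T', 'E', 'S', 'T' }; /* 7 bytes, no '\0' */
    char literal[] = "GO_TEST";                            /* 8 bytes, ends in '\0' */

    printf("braces:  sizeof = %zu, terminated = %d\n",
           sizeof braces, is_terminated(braces, sizeof braces));
    printf("literal: sizeof = %zu, terminated = %d\n",
           sizeof literal, is_terminated(literal, sizeof literal));
}
```

Calling compare_declarations() from main should report a size of 7 with no terminator for the brace-initialized array and a size of 8 with a terminator for the string-literal version. With both arrays declared from string literals, strlen(test_string) and strspn(test_string, test_string2) each return 4, as the question expected.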