How do strings and char arrays work in C?

temporary_user_name · Oct 4, 2012 · Viewed 12.9k times

No guides I've seen seem to explain this very well.

I mean, you can allocate memory for a char*, or write char[25] instead? What's the difference? And then there are literals, which can't be manipulated? What if you want to assign a fixed string to a variable? Like, stringVariable = "thisIsALiteral", then how do you manipulate it afterwards?

Can someone set the record straight here? And in the last case, with the literal, how do you take care of null-termination? I find this very confusing.


EDIT: The real problem seems to be that, as I understand it, you have to juggle these different constructs in order to accomplish even simple things. For instance, only char * can be passed as an argument or return value, but only char[] can be assigned a literal and modified. I feel like it's obvious that we frequently/always need to be able to do both, and that's where my pitfall is.

Answer

Sergey Kalinichenko · Oct 4, 2012

What is the difference between an allocated char* and char[25]?

The lifetime of a malloc-ed string is not limited by the scope of its declaration: the memory stays valid until you call free on it. In plain language, you can return a malloc-ed string from a function; you cannot do the same with a char[25] allocated in automatic storage, because its memory is reclaimed upon return from the function.

Can literals be manipulated?

String literals must not be modified in place: attempting to do so is undefined behavior, and compilers typically place them in read-only storage. To manipulate the text, you need to copy it into modifiable storage first, whether static, automatic, or dynamic. This cannot be done:

char *str = "hello";
str[0] = 'H'; // <<== WRONG! This is undefined behavior.

This will work:

char str[] = "hello";
str[0] = 'H'; // <<=== This is OK

This works too:

char *str = malloc(6); // 5 characters plus the terminating '\0'
strcpy(str, "hello");
str[0] = 'H'; // <<=== This is OK too

How do you take care of null termination of string literals?

The C compiler takes care of null termination for you: every string literal has an extra character at the end, set to \0.