Recently I have come across this problem which I am unable to understand by myself.
What do these three expressions REALLY mean?
*ptr++
*++ptr
++*ptr
I have tried Ritchie, but unfortunately I was unable to follow what he says about these three operations.
I know they are all used to increment the pointer or the value pointed to. I can also guess that precedence and order of evaluation play a large part: one increments the pointer first and then fetches the content it points to, another simply fetches the content and then increments the pointer, and so on. As you can see, I don't have a clear understanding of what they actually do, which I would like to clear up as soon as possible. I am truly lost when I get a chance to apply them in programs. For example:
#include <stdio.h>

int main()
{
    const char *p = "Hello";
    while(*p++)
        printf("%c",*p);
    return 0;
}
gives me this output:
ello
But my expectation was that it would print Hello.
One final request: please give me examples of how each expression works in a given code snippet. Most of the time, a mere paragraph of theory goes right over my head.
Here's a detailed explanation which I hope will be helpful. Let's begin with your program, as it's the simplest to explain.
#include <stdio.h>

int main()
{
    const char *p = "Hello";
    while(*p++)
        printf("%c",*p);
    return 0;
}
The first statement:
const char* p = "Hello";
declares p as a pointer to char. When we say "pointer to a char", what does that mean? It means that the value of p is the address of a char; p tells us where in memory there is some space set aside to hold a char.
The statement also initializes p to point to the first character in the string literal "Hello". For the sake of this exercise, it's important to understand p as pointing not to the entire string, but only to the first character, 'H'. After all, p is a pointer to one char, not to the entire string. The value of p is the address of the 'H' in "Hello".
Then you set up a loop:
while (*p++)
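What does the loop condition *p++ mean? Three things are at work here that make this puzzling (at least until familiarity sets in): the precedence of the two operators, postfix ++ and indirection *; the value of a postfix increment expression; and the side effect of a postfix increment expression.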
1. Precedence. A quick glance at the precedence table for operators will tell you that postfix increment has a higher precedence (16) than dereference / indirection (15). This means that the complex expression *p++ is going to be grouped as: *(p++). That is to say, the * part will be applied to the value of the p++ part. So let's take the p++ part first.
2. Postfix expression value. The value of p++ is the value of p before the increment. If you have:
int i = 7;
printf ("%d\n", i++);
printf ("%d\n", i);
the output will be:
7
8
because i++ evaluates to i before the increment. Similarly p++ is going to evaluate to the current value of p. As we know, the current value of p is the address of 'H'.
So now the p++ part of *p++ has been evaluated; it's the current value of p. Then the * part happens. *(current value of p) means: access the value at the address held by p. We know that the value at that address is 'H'. So the expression *p++ evaluates to 'H'.
Now hold on a minute, you're saying. If *p++ evaluates to 'H', why doesn't that 'H' print in the above code? That's where side effects come in.
3. Postfix expression side effects. The postfix ++ has the value of the current operand, but it has the side effect of incrementing that operand. Huh? Take a look at that int code again:
int i = 7;
printf ("%d\n", i++);
printf ("%d\n", i);
As noted earlier, the output will be:
7
8
When i++ is evaluated in the first printf(), it evaluates to 7. But the C standard guarantees that at some point before the second printf() begins executing, the side effect of the ++ operator will have taken place. That is to say, before the second printf() happens, i will have been incremented as a result of the ++ operator in the first printf(). This, by the way, is one of the few guarantees the standard gives about the timing of side effects.
In your code, then, when the expression *p++ is evaluated, it evaluates to 'H'. But by the time you get to this:
printf ("%c", *p)
that pesky side effect has occurred. p has been incremented. Whoa! It no longer points to 'H', but to one character past 'H': to the 'e', in other words. That explains your cockneyfied output:
ello
Hence the chorus of helpful (and accurate) suggestions in the other answers: to print the Received Pronunciation "Hello" and not its cockney counterpart, you need something like
while (*p)
    printf ("%c", *p++);
So much for that. What about the rest? You ask about the meanings of these:
*ptr++
*++ptr
++*ptr
We just talked about the first, so let's look at the second: *++ptr.
We saw in our earlier explanation that postfix increment p++ has a certain precedence, a value, and a side effect. The prefix increment ++p has the same side effect as its postfix counterpart: it increments its operand by 1. However, it has a different precedence and a different value.
The prefix increment has lower precedence than the postfix; it has precedence 15. In other words, it has the same precedence as the dereference / indirection operator *. In an expression like
*++ptr
what matters is not precedence: the two operators are identical in precedence. So associativity kicks in. The prefix increment and the indirection operator have right-to-left associativity. Because of that associativity, the operand ptr is going to be grouped with the rightmost operator ++ before the operator more to the left, *. In other words, the expression is going to be grouped *(++ptr). So, as with *ptr++ but for a different reason, here too the * part is going to be applied to the value of the ++ptr part.
So what is that value? The value of the prefix increment expression is the value of the operand after the increment. This makes it a very different beast from the postfix increment operator. Let's say you have:
int i = 7;
printf ("%d\n", ++i);
printf ("%d\n", i);
The output will be:
8
8
... different from what we saw with the postfix operator. Similarly, if you have:
const char* p = "Hello";
printf ("%c ", *p); // note space in format string
printf ("%c ", *++p); // value of ++p is p after the increment
printf ("%c ", *p++); // value of p++ is p before the increment
printf ("%c ", *p); // value of p has been incremented as a side effect of p++
the output will be:
H e e l // good dog
Do you see why?
Now we get to the third expression you asked about, ++*ptr. That's the trickiest of the lot, actually. Both operators have the same precedence, and right-to-left associativity. This means the expression will be grouped ++(*ptr). The ++ part will be applied to the value of the *ptr part.
So if we have:
char q[] = "Hello";
char* p = q;
printf ("%c", ++*p);
the surprisingly egotistical output is going to be:
I
What?! Okay, so the *p part is going to evaluate to 'H'. Then the ++ comes into play, at which point, it's going to be applied to the 'H', not to the pointer at all! What happens when you add 1 to 'H'? You get 1 plus the ASCII value of 'H', 72; you get 73. Represent that as a char, and you get the char with the ASCII value of 73: 'I'.
That takes care of the three expressions you asked about in your question. Here is another, mentioned in the first comment to your question:
(*ptr)++
That one is interesting too. If you have:
char q[] = "Hello";
char* p = q;
printf ("%c", (*p)++);
printf ("%c\n", *p);
it will give you this enthusiastic output:
HI
What's going on? Again, it's a matter of precedence, expression value, and side effects. Because of the parentheses, the *p part is treated as a primary expression. Primary expressions bind more tightly than any operator, so the * is grouped with p first, and the ++ is applied to the result. And *p, as you know, designates the first char of q, which currently holds 'H'. So, in this case, (*p)++ is a postfix increment of that char, not of the pointer.
What is the value of (*p)++? If you said 'I', you've forgotten (already!) our discussion of value vs. side effect with postfix increment. Remember, a postfix increment evaluates to the current value of its operand, here 'H'. So that first printf() is going to print 'H'. Then, as a side effect, the char that p points to is incremented from 'H' to 'I'. The second printf() prints that 'I'. And you have your cheery greeting.
All right, but in those last two cases, why do I need
char q[] = "Hello";
char* p = q;
Why can't I just have something like
/*const*/ char* p = "Hello";
printf ("%c", ++*p); // attempting to change string literal!
Because "Hello" is a string literal. If you try ++*p, you're trying to change the 'H' in the string to 'I', making the whole string "Iello". In C, string literals must be treated as read-only; attempting to modify one invokes undefined behavior. "Iello" is undefined in English as well, but that's just coincidence.
Conversely, you can't have
char p[] = "Hello";
printf ("%c", *++p); // attempting to modify value of array identifier!
Why not? Because in this instance, p is an array. An array is not a modifiable l-value; you can't change where p points by pre- or post- increment or decrement, because the name of the array works as though it's a constant pointer. (That's not what it actually is; that's just a convenient way to look at it.)
To sum up, here are the three things you asked about:
*ptr++ // effectively dereferences the pointer, then increments the pointer
*++ptr // effectively increments the pointer, then dereferences the pointer
++*ptr // effectively dereferences the pointer, then increments dereferenced value
And here's a fourth, every bit as much fun as the other three:
(*ptr)++ // effectively forces a dereference, then increments dereferenced value
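The first and second will not compile if ptr is actually an array identifier, because an array is not a modifiable l-value. The third and fourth invoke undefined behavior, which often shows up as a crash, if ptr points to a string literal.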
There you have it. I hope it's all crystal now. You've been a great audience, and I'll be here all week.