How does this piece of code determine array size without using sizeof( )?

janojlic picture janojlic · May 15, 2019 · Viewed 8.4k times · Source

Going through some C interview questions, I've found a question stating "How to find the size of an array in C without using the sizeof operator?", with the following solution. It works, but I cannot understand why.

#include <stdio.h>

int main() {
    int a[] = {100, 200, 300, 400, 500};
    int size = 0;

    size = *(&a + 1) - a;
    printf("%d\n", size);

    return 0;
}

As expected, it returns 5.

edit: people pointed out this answer, but the syntax does differ a bit, i.e. the indexing method

size = (&arr)[1] - arr;

so I believe both questions are valid and have a slightly different approach to the problem. Thank you all for the immense help and thorough explanation!

Answer

John Bode picture John Bode · May 15, 2019

When you add 1 to a pointer, the result is the location of the next object in a sequence of objects of the pointed-to type (i.e., an array). If p points to an int object, then p + 1 will point to the next int in a sequence. If p points to a 5-element array of int (in this case, the expression &a), then p + 1 will point to the next 5-element array of int in a sequence.

Subtracting two pointers (provided they both point into the same array object, or one is pointing one past the last element of the array) yields the number of objects (array elements) between those two pointers.

The expression &a yields the address of a, and has the type int (*)[5] (pointer to 5-element array of int). The expression &a + 1 yields the address of the next 5-element array of int following a, and also has the type int (*)[5]. The expression *(&a + 1) dereferences the result of &a + 1, such that it yields the address of the first int following the last element of a, and has type int [5], which in this context "decays" to an expression of type int *.

Similarly, the expression a "decays" to a pointer to the first element of the array and has type int *.

A picture may help:

int [5]  int (*)[5]     int      int *

+---+                   +---+
|   | <- &a             |   | <- a
| - |                   +---+
|   |                   |   | <- a + 1
| - |                   +---+
|   |                   |   |
| - |                   +---+
|   |                   |   |
| - |                   +---+
|   |                   |   |
+---+                   +---+
|   | <- &a + 1         |   | <- *(&a + 1)
| - |                   +---+
|   |                   |   |
| - |                   +---+
|   |                   |   |
| - |                   +---+
|   |                   |   |
| - |                   +---+
|   |                   |   |
+---+                   +---+

This is two views of the same storage - on the left, we're viewing it as a sequence of 5-element arrays of int, while on the right, we're viewing it as a sequence of int. I also show the various expressions and their types.

Be aware, the expression *(&a + 1) results in undefined behavior:

...
If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

C 2011 Online Draft, 6.5.6/9