Why is -(-2147483648) = -2147483648 on a 32-bit machine?

Lesscomfortable · Feb 25, 2017 · Viewed 19.7k times

I think the question is self-explanatory. I guess it has something to do with overflow, but I still do not quite get it. What is happening, bitwise, under the hood?

Why does -(-2147483648) = -2147483648 (at least while compiling in C)?

Answer

Grzegorz Szpetkowski · Feb 26, 2017

Negating an (unsuffixed) integer constant:

The expression -(-2147483648) is perfectly well defined in C; however, it may not be obvious why.

When you write -2147483648, it is parsed as the unary minus operator applied to the integer constant 2147483648. If 2147483648 cannot be represented as int, the constant is given type long or long long* (whichever fits first); the latter type is guaranteed by the C Standard to cover that value†.

To confirm this, you can print the size of the expression:

printf("%zu\n", sizeof(-2147483648));

which yields 8 on my machine.
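The size alone does not say which type was chosen. As a rough sketch (assuming a C11 compiler, since it relies on _Generic), the type of the constant can be named directly. The macro TYPE_NAME and the helpers literal_type and double_negated are hypothetical names introduced only for this illustration:

```c
#include <stdio.h>

/* Name the type the compiler picked for an expression (C11 _Generic). */
#define TYPE_NAME(x) _Generic((x), \
    int: "int",                    \
    long: "long",                  \
    long long: "long long",        \
    default: "other")

/* Hypothetical helper: type of the unsuffixed constant after unary minus. */
const char *literal_type(void)
{
    return TYPE_NAME(-2147483648);
}

/* Hypothetical helper: the double negation, evaluated entirely in the
   wider type, so no conversion to a 32-bit int takes place. */
long long double_negated(void)
{
    return -(-2147483648);
}
```

Calling printf("%s\n", literal_type()); reports whichever type the implementation selected, and double_negated() returns the positive value 2147483648 because nothing has been narrowed to int yet.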

The next step is to apply the second - operator, in which case the final value is 2147483648L (assuming the constant was given type long). If you try to assign it to an int object, as follows:

int n = -(-2147483648);

then, because the value does not fit in a 32-bit int, the result of the conversion is implementation-defined. Quoting the Standard:

C11 §6.3.1.3/3 Signed and unsigned integers

Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

The most common approach is simply to cut off the higher bits. For instance, GCC documents it as:

For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.

Conceptually, the conversion to a type of width 32 can be illustrated by a bitwise AND operation (here 2^32 denotes a power of two, not C's XOR operator):

value & (2^32 - 1) // preserve 32 least significant bits

In two's complement arithmetic, the resulting n has only the MSB (sign bit) set and all other bits zero, which represents -2^31, that is, -2147483648.
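The wrap-around described above can be reproduced without depending on the implementation-defined conversion itself. The following is only a sketch: wrap_to_int32 is a hypothetical helper, and it assumes the fixed-width types int32_t and uint32_t are available. It reduces the value modulo 2^32 using unsigned arithmetic (which is well defined) and then rebuilds the two's complement interpretation by hand:

```c
#include <stdint.h>

/* Hypothetical helper: map a 64-bit value into the 32-bit signed range
   the way the GCC documentation describes (reduction modulo 2^32). */
int32_t wrap_to_int32(int64_t value)
{
    /* Conversion to an unsigned type is well defined: modulo 2^32. */
    uint32_t low = (uint32_t)value;

    /* Values up to INT32_MAX keep their meaning. */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;

    /* Otherwise the sign bit is set: compute low - 2^32 without overflow. */
    return (int32_t)(low - (uint32_t)INT32_MAX - 1u) + INT32_MIN;
}
```

With this helper, wrap_to_int32(2147483648LL) gives INT32_MIN: the low 32 bits are 0x80000000, the pattern with only the sign bit set.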

Negating an int object:

If you try to negate an int object that holds the value -2147483648, then, assuming a two's complement machine, the program exhibits undefined behavior:

n = -n; // UB if n == INT_MIN and INT_MAX == 2147483647

C11 §6.5/5 Expressions

If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.
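One way to stay clear of this undefined behavior is to test the operand before negating it. The sketch below is not from the answer; checked_negate is a hypothetical helper that refuses the one value whose negation cannot be represented:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: store -n in *out and return true when the result
   fits in int; return false (leaving *out untouched) otherwise. */
bool checked_negate(int n, int *out)
{
    /* On a two's complement machine INT_MIN == -INT_MAX - 1, so the only
       value below -INT_MAX is INT_MIN, whose negation would overflow. */
    if (n < -INT_MAX)
        return false;

    *out = -n;
    return true;
}
```

A caller can then decide what to do for INT_MIN (report an error, widen to long long, and so on) instead of evaluating -n unconditionally.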


*) In the withdrawn C90 Standard, there was no long long type and the rules were different. Specifically, the sequence of types for an unsuffixed decimal constant was int, long int, unsigned long int (C90 §6.1.3.2 Integer constants).

†) This is due to LLONG_MAX, which must be at least +9223372036854775807 (C11 §5.2.4.2.1/1).