I think the question is self-explanatory. I guess it has something to do with overflow, but I still do not quite get it. What is happening, bitwise, under the hood?
Why does -(-2147483648) = -2147483648 (at least while compiling in C)?
The expression -(-2147483648) is perfectly well defined in C; however, it may not be obvious why it behaves this way.
When you write -2147483648, it is parsed as the unary minus operator applied to the integer constant 2147483648. If 2147483648 cannot be represented as int, then the constant has type long or long long* (whichever fits first), where the latter type is guaranteed by the C Standard to cover that value†.
To confirm this, you can inspect the size of the expression:
printf("%zu\n", sizeof(-2147483648));
which yields 8 on my machine.
The next step is to apply the second - operator, which gives the final value 2147483648L (assuming the constant was represented as long). If you try to assign it to an int object, as follows:
int n = -(-2147483648);
then the actual behavior is implementation-defined. Referring to the Standard:
C11 §6.3.1.3/3 Signed and unsigned integers
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
The most common behavior is simply to cut off the higher bits. For instance, GCC documents it as:
For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.
Conceptually, the conversion to a type of width 32 can be illustrated by a bitwise AND operation:
value & (2^32 - 1) // preserve 32 least significant bits (here ^ denotes exponentiation, not C's XOR)
In accordance with two's complement arithmetic, the value of n consists of all zero bits with only the MSB (sign bit) set, which represents the value -2^31, that is, -2147483648.
Negating an int object:
If you try to negate an int object that holds the value -2147483648, then, assuming a two's complement machine, the program will exhibit undefined behavior:
n = -n; // UB if n == INT_MIN and INT_MAX == 2147483647
C11 §6.5/5 Expressions
If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.
*) In the withdrawn C90 Standard, there was no long long type and the rules were different. Specifically, the sequence of candidate types for an unsuffixed decimal constant was int, long int, unsigned long int (C90 §6.1.3.2 Integer constants).
†) This is due to LLONG_MAX, which must be at least +9223372036854775807 (C11 §5.2.4.2.1/1).