Why is unsigned integer overflow defined behavior but signed integer overflow isn't?

Anthony Vallée-Dubois picture Anthony Vallée-Dubois · Aug 12, 2013 · Viewed 47k times · Source

Unsigned integer overflow is well defined by both the C and C++ standards. For example, the C99 standard (§6.2.5/9) states

A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.

However, both standards state that signed integer overflow is undefined behavior. Again, from the C99 standard (§3.4.3/1)

An example of undefined behavior is the behavior on integer overflow

Is there an historical or (even better!) a technical reason for this discrepancy?

Answer

Pascal Cuoq picture Pascal Cuoq · Aug 12, 2013

The historical reason is that most C implementations (compilers) just used whatever overflow behaviour was easiest to implement with the integer representation it used. C implementations usually used the same representation used by the CPU - so the overflow behavior followed from the integer representation used by the CPU.

In practice, it is only the representations for signed values that may differ according to the implementation: one's complement, two's complement, sign-magnitude. For an unsigned type there is no reason for the standard to allow variation because there is only one obvious binary representation (the standard only allows binary representation).

Relevant quotes:

C99 6.2.6.1:3:

Values stored in unsigned bit-fields and objects of type unsigned char shall be represented using a pure binary notation.

C99 6.2.6.2:2:

If the sign bit is one, the value shall be modified in one of the following ways:

— the corresponding value with sign bit 0 is negated (sign and magnitude);

— the sign bit has the value −(2N) (two’s complement);

— the sign bit has the value −(2N − 1) (one’s complement).


Nowadays, all processors use two's complement representation, but signed arithmetic overflow remains undefined and compiler makers want it to remain undefined because they use this undefinedness to help with optimization. See for instance this blog post by Ian Lance Taylor or this complaint by Agner Fog, and the answers to his bug report.