C language boolean expression return value

user3277859 · Mar 18, 2014 · Viewed 10.6k times

The C language traditionally has no dedicated Boolean datatype and uses integers instead (C99 added _Bool, but the operators still yield int). Comparison operators such as == and <= return the integer value 0 for false and 1 for true. However, the if statement in C considers any nonzero value of its condition to be equivalent to true. Why the difference? Why not allow the relational operators to return any nonzero value to represent true?
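To make the asymmetry concrete, here is a minimal sketch. The helper names taken_by_if and show_comparisons are made up for illustration and do not come from the question.

```c
#include <stdio.h>

/* Hypothetical helper: reports whether `if` treats `value` as true.
   Any nonzero value takes the branch, not just 1. */
int taken_by_if(int value)
{
    if (value) {
        return 1;
    }
    return 0;
}

/* Hypothetical helper: prints the int results of two comparisons.
   Each printed result is exactly 0 or 1. */
void show_comparisons(int a, int b)
{
    printf("%d == %d yields %d\n", a, b, a == b);
    printf("%d <= %d yields %d\n", a, b, a <= b);
}
```

Calling taken_by_if(42) or taken_by_if(-1) returns 1, while taken_by_if(0) returns 0. The comparisons themselves never produce anything other than 0 or 1.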

Answer

Keith Thompson · Mar 18, 2014

I believe it was an arbitrary decision, going back to C's ancestor language B.

Quoting the Users' Reference to B:

The relational operators < (less than), <= (less than or equal to), > (greater than), and >= (greater than or equal to) take integer rvalue operands. The result is one if the operands are in the given relation to one another. The result is zero otherwise.

No explanation is given of this particular choice, nor is it explained in the ANSI C Rationale or in the 1978 1st edition of Kernighan & Ritchie's book "The C Programming Language" (K&R1).

(B's ancestor language BCPL had true and false literals, with true being represented with all bits set to 1.)

The language could have been defined differently, and it still would have been internally consistent. For example, the standard could have said that the relational and equality operators yield a result of 0 if the condition is false, or any arbitrary non-zero value if the condition is true. The result could still be used correctly in an if statement, or any other context requiring a condition. And it's easy to imagine a CPU on which it's more efficient for a true value to be represented as all-bits-one rather than as 1 -- but the language standard doesn't permit that.

Several standard library functions, such as isdigit(), may return any arbitrary non-zero value to indicate a true condition, which further demonstrates that this was an arbitrary choice. (isdigit() is naturally implemented via a table lookup, which can yield values other than 0 and 1.)
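Because isdigit() only promises "nonzero" for true, code that needs exactly 0 or 1 has to normalize the result itself. A minimal sketch, where count_digits is a hypothetical helper written for this illustration:

```c
#include <ctype.h>

/* Hypothetical helper: counts the decimal digits in a string. */
int count_digits(const char *s)
{
    int count = 0;
    for (; *s != '\0'; s++) {
        /* Cast to unsigned char: passing a negative char value to
           isdigit() is undefined behavior. */
        /* isdigit() may return any nonzero value for a digit, so
           compare against 0 to get exactly 0 or 1 before adding. */
        count += isdigit((unsigned char)*s) != 0;
    }
    return count;
}
```

Writing count += isdigit(...) without the != 0 could add some value other than 1 per digit, depending on the implementation.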

There is some added convenience in having the equality and relational operators yield 0 and 1. For example, this makes it easy to keep a count of how many conditions are true:

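int count = 0;
count += x == y;        /* adds 1 if x equals y, otherwise 0 */
count += foo > bar;     /* adds 1 if foo is greater than bar */
count += this <= that;  /* adds 1 if this is at most that */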
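For a version that compiles on its own, here is the same idea wrapped in a function. The name count_true and its parameters are assumptions made for this sketch. The parameters lhs and rhs stand in for this and that, so the code also compiles as C++, where this is a keyword.

```c
/* Hypothetical helper: returns how many of three conditions hold.
   Relies on == , > and <= each yielding exactly 0 or 1. */
int count_true(int x, int y, int foo, int bar, int lhs, int rhs)
{
    int count = 0;
    count += x == y;
    count += foo > bar;
    count += lhs <= rhs;
    return count;
}
```

For example, count_true(1, 1, 5, 3, 2, 2) returns 3, because all three conditions hold, and count_true(1, 2, 5, 3, 4, 2) returns 1, because only foo > bar holds.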

My guess is that it was convenient to use 0 and 1 in the first B compiler, the behavior was documented, and it's been inherited up until today. Changing the definition would have broken code that depended on the previous definition.

And even if it's relatively inefficient on some systems, it's not a huge problem. The result of an equality or relational operator is not usually stored anywhere, so a compiler can represent the result however it likes as long as the behavior is consistent. It might have to generate code to normalize the result to 0 or 1 in some cases, but that's not likely to be significant.
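The two situations can be sketched side by side. The function names are made up for this illustration, and what a given compiler actually emits for each one is implementation-specific.

```c
/* Hypothetical helper: the comparison only selects a branch, so the
   compiler never has to materialize its result as an int. */
int min_of(int a, int b)
{
    return (a < b) ? a : b;
}

/* Hypothetical helper: the comparison result is itself the return
   value, so it must be exactly 0 or 1 as far as the caller can tell. */
int less_as_int(int a, int b)
{
    return a < b;
}
```

Only the second function forces the result into the 0-or-1 form. Code like the first is far more common.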