Boolean.hashCode()

user471011 picture user471011 · Oct 12, 2010 · Viewed 15.8k times · Source

The hashCode() method of class Boolean is implemented like this:

public int hashCode() {
    return value ? 1231 : 1237;
}

Why does it use 1231 and 1237? Why not something else?

Answer

aioobe picture aioobe · Oct 12, 2010

1231 and 1237 are just two (sufficiently large) arbitrary prime numbers. Any other two large prime numbers would do fine.

Why primes?
Suppose for a second that we picked composite numbers (non-primes), say 1000 and 2000. When inserting booleans into a hash table, true and false would go into bucket 1000 % N resp 2000 % N (where N is the number of buckets).

Now notice that

  • 1000 % 8 same bucket as 2000 % 8
  • 1000 % 10 same bucket as 2000 % 10
  • 1000 % 20 same bucket as 2000 % 20
  • ....

in other words, it would lead to many collisions.

This is because the factorization of 1000 (23, 53) and the factorization of 2000 (24, 53) have so many common factors. Thus prime numbers are chosen, since they are unlikely to have any common factors with the bucket size.

Why large primes. Wouldn't 2 and 3 do?
When computing hash codes for composite objects it's common to add the hash codes for the components. If too small values are used in a hash set with a large number of buckets there's a risk of ending up with an uneven distribution of objects.

Do collisions matter? Booleans just have two different values anyway?
Maps can contain booleans together with other objects. Also, as pointed out by Drunix, a common way to create hash functions of composite objects is to reuse the subcomponents hash code implementations in which case it's good to return large primes.

Related questions: