What is the hashCode for a custom class having just two int properties?

Sophie Sperner · Jul 31, 2012 · Viewed 22.9k times

In Java, I have a class that represents a point with int coordinates:

public class Point {
    int x = -1;
    int y = -1;

    public Point(int xNew, int yNew) {
        x = xNew;
        y = yNew;
    }

    @Override
    public boolean equals(Object o) {
        // instanceof also rejects null, so equals never throws for null or foreign arguments
        if (!(o instanceof Point)) {
            return false;
        }
        Point other = (Point) o;
        return x == other.x && y == other.y;
    }
}

I'm using objects of class Point as keys in a HashMap and as elements in a HashSet.

What would be the best candidate for the hashCode function? My first idea was to return a double whose integer part is x and whose fractional part is y: for example, x = 4, y = 12 would give 4.12. But hashCode is declared to return int, so it cannot return a double.

This is not an option:

public int hashCode() {
    // no need to check for exception parseInt since x and y are valid by design
    return Integer.parseInt(Integer.toString(x) + Integer.toString(y));
}

because x and y can have too many digits: the concatenated string can exceed the int range, in which case parseInt throws a NumberFormatException.
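To make that concrete, here is a minimal sketch of the concatenation approach. The names ConcatHashDemo and concatHash are made up for illustration. It shows the overflow described above, plus a second weakness: different points can produce the same digit string.

```java
class ConcatHashDemo {
    // The approach from the question: join the decimal digits, then parse them.
    static int concatHash(int x, int y) {
        return Integer.parseInt(Integer.toString(x) + Integer.toString(y));
    }

    public static void main(String[] args) {
        // Ambiguity: "1" + "23" and "12" + "3" are both "123".
        System.out.println(concatHash(1, 23));
        System.out.println(concatHash(12, 3));

        // Overflow: "100000" + "100000" has 12 digits, which is more than an int can hold.
        try {
            concatHash(100000, 100000);
        } catch (NumberFormatException e) {
            System.out.println("parseInt failed: " + e.getMessage());
        }
    }
}
```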

Answer

Jon Skeet · Jul 31, 2012

You can't change the type of hashCode, nor should you want to.

I'd just go with something like:

public int hashCode() {
    return x * 31 + y;
}

Note that this gives (a, b) and (b, a) different hash codes in most cases (unlike, for example, adding or XOR-ing the two fields). This can be useful if keys with the two values switched often occur together in real life.
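As a quick illustration of the difference, the sketch below compares the multiply-and-add hash with a plain XOR on a swapped pair. The class and method names (OrderSensitivityDemo, weighted, xor) are only for this example.

```java
class OrderSensitivityDemo {
    // Order-sensitive: x is scaled before y is added.
    static int weighted(int x, int y) {
        return x * 31 + y;
    }

    // Symmetric: swapping x and y never changes the result.
    static int xor(int x, int y) {
        return x ^ y;
    }

    public static void main(String[] args) {
        System.out.println(weighted(1, 2)); // 33
        System.out.println(weighted(2, 1)); // 63
        System.out.println(xor(1, 2));      // 3
        System.out.println(xor(2, 1));      // 3
    }
}
```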

It isn't unique - but hash codes don't have to be. They just have to be the same for equal values (for correctness), and (for efficiency) "usually" different for non-equal values, with a reasonable distribution.
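A small sketch of why collisions are harmless for correctness: the points (0, 31) and (1, 0) share the hash code 31 under x * 31 + y, yet a HashMap still keeps them apart because equals tells them apart. The class names CollisionDemo and P are hypothetical stand-ins for the Point class in the question.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    // Stand-in for the question's Point class, with a null-safe equals.
    static final class P {
        final int x;
        final int y;

        P(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof P)) {
                return false;
            }
            P other = (P) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return x * 31 + y;
        }
    }

    // Inserts two colliding keys, then overwrites one of them.
    static int distinctKeys() {
        Map<P, String> map = new HashMap<>();
        map.put(new P(0, 31), "first");
        map.put(new P(1, 0), "second");    // same hash code, different key
        map.put(new P(0, 31), "replaced"); // equal key, so this overwrites "first"
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(new P(0, 31).hashCode() == new P(1, 0).hashCode()); // true
        System.out.println(distinctKeys()); // 2
    }
}
```

The collision only costs a little time: both keys land in the same bucket, and the map falls back to equals to pick the right one.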

In general, I follow the same kind of pattern as Josh Bloch suggests in Effective Java:

public int hashCode() {
    int hash = 17;
    hash = hash * 31 + field1Hash;
    hash = hash * 31 + field2Hash;
    hash = hash * 31 + field3Hash;
    hash = hash * 31 + field4Hash;
    ...
    return hash;
}

Here field1Hash is the hash code of a reference-type field (or 0 for a null reference), the value itself for an int field, some sort of hash from 64 bits down to 32 for a long, and so on.
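For a concrete instance of that pattern, here is a sketch on a made-up class. Measurement and its fields label, count and timestamp are assumptions for illustration, not anything from the question. It covers one nullable reference field, one int and one long.

```java
import java.util.Objects;

// Hypothetical class used only to show the pattern on three field kinds.
final class Measurement {
    private final String label;   // reference type, may be null
    private final int count;
    private final long timestamp;

    Measurement(String label, int count, long timestamp) {
        this.label = label;
        this.count = count;
        this.timestamp = timestamp;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Measurement)) {
            return false;
        }
        Measurement other = (Measurement) o;
        return Objects.equals(label, other.label)
                && count == other.count
                && timestamp == other.timestamp;
    }

    @Override
    public int hashCode() {
        int hash = 17;
        // Reference field: its own hash code, or 0 when null.
        hash = hash * 31 + (label == null ? 0 : label.hashCode());
        // int field: the value itself.
        hash = hash * 31 + count;
        // long field: fold the high 32 bits into the low 32 (same formula as Long.hashCode).
        hash = hash * 31 + (int) (timestamp ^ (timestamp >>> 32));
        return hash;
    }
}
```

Every field that equals compares also feeds into hashCode, which is what keeps equal objects on equal hash codes.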

EDIT: I can't remember the details of why 31 and 17 work well together. The fact that they're both prime may be useful - but from what I remember, the maths behind why hashes like this are generally reasonable (though not as good as hashes where the distribution of likely values is known in advance) is either difficult or not well understood. I know that multiplying by 31 is cheap (shift left 5 and subtract the original value)...
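The shift-and-subtract identity is easy to check. In this sketch (ShiftDemo and times31 are illustrative names), (x << 5) - x matches 31 * x for every int, including values where the multiplication overflows, because both sides wrap around modulo 2^32 in the same way.

```java
class ShiftDemo {
    // x << 5 is x * 32, so subtracting x once leaves x * 31.
    static int times31(int x) {
        return (x << 5) - x;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 12345, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int v : samples) {
            System.out.println(v + ": " + (times31(v) == 31 * v));
        }
    }
}
```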