In Java, I have a class that represents a point with int coordinates:
    public class Point {
        int x = -1;
        int y = -1;

        public Point (int xNew, int yNew) {
            x = xNew; y = yNew;
        }

        public boolean equals (Object o) {
            // no need for (o instanceof Point) by design
            return x == ((Point)o).x && y == ((Point)o).y;
        }
    }
I'm using objects of class Point as keys in a HashMap and as elements in a HashSet.
What would be the best candidate for the hashCode function? I would make it a double so that the left part is x and the right part is y: for example, with x = 4, y = 12, hashCode would return 4.12. But hashCode is declared to return int, so it cannot be a double.
This is not an option:
    public int hashCode() {
        // no need to check for exception parseInt since x and y are valid by design
        return Integer.parseInt(Integer.toString(x) + Integer.toString(y));
    }
because the values of x and y can be too long, so that the concatenated digits no longer fit in an int and cannot be parsed.
You can't change the type of hashCode, nor should you want to.
I'd just go with something like:
    public int hashCode() {
        return x * 31 + y;
    }
Note that this gives (a, b) a different hash code from (b, a) in most cases (unlike, e.g., adding or XOR-ing the two values). This can be useful if you often end up with keys for the "switched" values in real life.
It isn't unique - but hash codes don't have to be. They just have to be the same for equal values (for correctness), and (for efficiency) "usually" different for non-equal values, with a reasonable distribution.
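As a rough illustration of both points, here is a sketch using a stand-in class. HashedPoint is a made-up name, and its equals adds an instanceof check that the question's Point deliberately omits:

```java
import java.util.HashSet;
import java.util.Set;

final class HashedPoint {
    final int x;
    final int y;

    HashedPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof HashedPoint)) {
            return false;
        }
        HashedPoint other = (HashedPoint) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        return x * 31 + y;
    }

    public static void main(String[] args) {
        // Swapped coordinates hash differently: 4*31+12 = 136, 12*31+4 = 376.
        System.out.println(new HashedPoint(4, 12).hashCode());
        System.out.println(new HashedPoint(12, 4).hashCode());

        // Collisions still exist: (0, 31) and (1, 0) both hash to 31.
        HashedPoint a = new HashedPoint(0, 31);
        HashedPoint b = new HashedPoint(1, 0);
        System.out.println(a.hashCode() == b.hashCode());

        // The set stays correct anyway, because equals tells them apart.
        Set<HashedPoint> set = new HashSet<>();
        set.add(a);
        set.add(b);
        set.add(new HashedPoint(1, 0)); // equal to b, so not added again
        System.out.println(set.size()); // 2
    }
}
```

The collision only costs an extra equals call inside the same bucket; it does not affect correctness.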
In general, I usually follow the same kind of pattern as Josh Bloch suggests in Effective Java:
    public int hashCode() {
        int hash = 17;
        hash = hash * 31 + field1Hash;
        hash = hash * 31 + field2Hash;
        hash = hash * 31 + field3Hash;
        hash = hash * 31 + field4Hash;
        ...
        return hash;
    }
Where field1Hash would be the hash code for reference type fields (or 0 for a null reference), the int itself for int values, some sort of hash from 64 bits to 32 for long, etc.
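To make the pattern concrete, here is a sketch for a hypothetical class with one field of each kind mentioned above. LabeledPoint and its fields are invented for the example, and Long.hashCode(long) (available since Java 8) is just one reasonable way to fold 64 bits into 32:

```java
final class LabeledPoint {
    private final String label;    // reference type, may be null
    private final int x;           // int, used as its own hash
    private final long timestamp;  // long, folded from 64 bits to 32

    LabeledPoint(String label, int x, long timestamp) {
        this.label = label;
        this.x = x;
        this.timestamp = timestamp;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof LabeledPoint)) {
            return false;
        }
        LabeledPoint other = (LabeledPoint) o;
        boolean sameLabel = (label == null)
                ? other.label == null
                : label.equals(other.label);
        return sameLabel && x == other.x && timestamp == other.timestamp;
    }

    @Override
    public int hashCode() {
        int hash = 17;
        // Reference field: its own hash code, or 0 for null.
        hash = hash * 31 + (label == null ? 0 : label.hashCode());
        // int field: the value itself.
        hash = hash * 31 + x;
        // long field: Long.hashCode XORs the upper and lower 32 bits.
        hash = hash * 31 + Long.hashCode(timestamp);
        return hash;
    }
}
```

Every field that equals compares also feeds into hashCode, which is what keeps equal objects at equal hash codes.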
EDIT: I can't remember the details of why 31 and 17 work well together. The fact that they're both prime may be useful - but from what I remember, the maths behind why hashes like this are generally reasonable (though not as good as hashes where the distribution of likely values is known in advance) is either difficult or not well understood. I know that multiplying by 31 is cheap (shift left 5 and subtract the original value)...
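The shift-and-subtract identity can be checked directly. A small sketch (the class and method names are hypothetical):

```java
final class Times31Demo {
    // Plain multiplication by 31.
    static int viaMultiply(int x) {
        return 31 * x;
    }

    // Shift left by 5 (multiply by 32), then subtract the original value.
    static int viaShift(int x) {
        return (x << 5) - x;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 4, 12, 123456789,
                         Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int x : samples) {
            // Both forms wrap around identically on overflow,
            // so they agree for every int value.
            System.out.println(x + ": " + (viaMultiply(x) == viaShift(x)));
        }
    }
}
```

Whether this matters in practice depends on the JVM and hardware; a JIT compiler may well make that substitution itself, so there is no need to write the shift by hand.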