A good hash function to use in interviews for integer numbers, strings?

phoenix picture phoenix · May 21, 2011 · Viewed 13.5k times · Source


I have come across situations in an interview where I needed to use a hash function for integer numbers or for strings. In such situations which ones should we choose ? I've been wrong in these situations because I end up choosing the ones which have generate lot of collisions but then hash functions tend to be mathematical that you cannot recollect them in an interview. Are there any general recommendations so atleast the interviewer is satisfied with your approach for integer numbers or string inputs? Which functions would be adequate for both inputs in an "interview situation"

Answer

Premraj picture Premraj · May 21, 2011

Here is a simple recipe from Effective java page 33:

  1. Store some constant nonzero value, say, 17, in an int variable called result.
  2. For each significant field f in your object (each field taken into account by the equals method, that is), do the following:
    1. Compute an int hash code c for the field:
      • If the field is a boolean, compute (f ? 1 : 0).
      • If the field is a byte, char, short, or int, compute (int) f.
      • If the field is a long, compute (int) (f ^ (f >>> 32)).
      • If the field is a float, compute Float.floatToIntBits(f).
      • If the field is a double, compute Double.doubleToLongBits(f), and then hash the resulting long as in step 2.1.iii.
      • If the field is an object reference and this class’s equals method compares the field by recursively invoking equals, recursively invoke hashCode on the field. If a more complex comparison is required, compute a “canonical representation” for this field and invoke hashCode on the canonical representation. If the value of the field is null, return 0 (or some other constant, but 0 is traditional). 48 CHAPTER 3 METHODS COMMON TO ALL OBJECTS
      • If the field is an array, treat it as if each element were a separate field. That is, compute a hash code for each significant element by applying these rules recursively, and combine these values per step 2.b. If every element in an array field is significant, you can use one of the Arrays.hashCode methods added in release 1.5.
    2. Combine the hash code c computed in step 2.1 into result as follows: result = 31 * result + c;
  3. Return result.
  4. When you are finished writing the hashCode method, ask yourself whether equal instances have equal hash codes. Write unit tests to verify your intuition! If equal instances have unequal hash codes, figure out why and fix the problem.