How does HashSet not allow duplicates?

Zeeshan picture Zeeshan · Mar 6, 2014 · Viewed 20k times · Source

I was going through the add method of HashSet. It is mentioned that

If this set already contains the element, the call leaves the set unchanged and returns false.

But the add method is internally saving the values in HashMap

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

The put method of HashMap states that

Associates the specified value with the specified key in this map. If the map previously contained a mapping for the key, the old value is replaced.

So if the put method of HashMap replaces the old value, how the HashSet add method leaves the set unchanged in case of duplicate elements?

Answer

yshavit picture yshavit · Mar 6, 2014

PRESENT is just a dummy value -- the set doesn't really care what it is. What the set does care about is the map's keys. So the logic goes like this:

Set.add(a):
  map.put(a, PRESENT) // so far, this is just what you said
    the key "a" is in the map, so...
      keep the "a" key, but map its value to the PRESENT we just passed in
      also, return the old value (which we'll call OLD)
  look at the return value: it's OLD, != null. So return false.

Now, the fact that OLD == PRESENT doesn't matter -- and note that Map.put doesn't change the key, just the value mapped to that key. Since the map's keys are what the Set really cares about, the Set is unchanged.

In fact, there has been some change to the underlying structures of the Set -- it replaced a mapping of (a, OLD) with (a, PRESENT). But that's not observable from outside the Set's implementation. (And as it happens, that change isn't even a real change, since OLD == PRESENT).