Equals and Comparable with Sets

OldCurmudgeon · Oct 6, 2012

I posted some code here which correctly solved a problem the poster had. OP wanted to remove duplicates and bring certain special items to the top of a list. I used a TreeSet with a special Comparable class which wrapped the Locale they were working with to achieve what they wanted.

I then got to thinking ... as you do ... that I was eliminating duplicates by returning 0 from the compareTo method, not by returning true from an equals implementation as one would need to do to correctly indicate a duplicate in a Set (from the definition of a Set).
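For illustration, the technique can be sketched roughly as follows. This is not the code from the original post: the wrapper name CountryKey, the choice of comparing by country code, and "US" as the special item are all assumptions made up for this sketch. The point is only that equals() is never overridden, yet the TreeSet still drops the second US locale because compareTo returns 0.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

class LocaleDedupDemo {
    // Hypothetical wrapper; not the class from the original post.
    static final class CountryKey implements Comparable<CountryKey> {
        final Locale locale;

        CountryKey(Locale locale) {
            this.locale = locale;
        }

        // "US" is an arbitrary stand-in for a special item that sorts first.
        private int rank() {
            return "US".equals(locale.getCountry()) ? 0 : 1;
        }

        @Override
        public int compareTo(CountryKey other) {
            int byRank = Integer.compare(rank(), other.rank());
            if (byRank != 0) {
                return byRank;
            }
            // A result of 0 here is what makes the TreeSet treat two keys as duplicates.
            return locale.getCountry().compareTo(other.locale.getCountry());
        }
        // equals() and hashCode() are deliberately left as inherited from Object.
    }

    // Returns the distinct country codes, special item first, then alphabetical.
    static List<String> orderedCountries(List<Locale> locales) {
        Set<CountryKey> keys = new TreeSet<CountryKey>();
        for (Locale l : locales) {
            keys.add(new CountryKey(l));
        }
        List<String> result = new ArrayList<String>();
        for (CountryKey k : keys) {
            result.add(k.locale.getCountry());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Locale> input = Arrays.asList(
                Locale.FRANCE, Locale.US, Locale.forLanguageTag("es-US"),
                Locale.CANADA, Locale.CANADA_FRENCH);
        System.out.println(orderedCountries(input)); // [US, CA, FR]
    }
}
```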

I have no objection to using this technique, but am I relying on what might be considered an undocumented feature? Is it safe to assume that this kind of thing will continue to work going forward?

Answer

Tomasz Nurkiewicz · Oct 6, 2012

It seems this is pretty well documented in the Javadoc of TreeSet:

Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.

Here is an example of the only (?) JDK class that implements Comparable but is not consistent with equals():

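import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;

// Same numeric value, three different scales
Set<BigDecimal> decimals = new HashSet<BigDecimal>();
decimals.add(new BigDecimal("42"));
decimals.add(new BigDecimal("42.0"));
decimals.add(new BigDecimal("42.00"));
System.out.println(decimals); // three elements, in unspecified order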

At the end, decimals contains three values, because 42, 42.0 and 42.00 are not equal as far as equals() is concerned. But if you replace HashSet with TreeSet, the resulting set contains only one item (42, which happened to be the first one added), as all of them are considered equal when compared using BigDecimal.compareTo().
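To see both behaviours side by side, here is a small self-contained sketch. The class name BigDecimalSetDemo and the helper fill are just names chosen for this example.

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class BigDecimalSetDemo {
    // Adds the same numeric value at three different scales to the given set.
    static Set<BigDecimal> fill(Set<BigDecimal> set) {
        set.add(new BigDecimal("42"));
        set.add(new BigDecimal("42.0"));
        set.add(new BigDecimal("42.00"));
        return set;
    }

    public static void main(String[] args) {
        Set<BigDecimal> hashed = fill(new HashSet<BigDecimal>());
        Set<BigDecimal> sorted = fill(new TreeSet<BigDecimal>());

        // HashSet relies on equals()/hashCode(), which take the scale into account.
        System.out.println(hashed.size()); // 3

        // TreeSet relies on compareTo(), which ignores the scale.
        System.out.println(sorted); // [42]
    }
}
```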

This shows that TreeSet is in a way "broken" when used with types whose ordering is not consistent with equals(). It still works properly and all operations are well-defined; it just doesn't obey the contract of the Set interface, under which two objects that are not equal() are not considered duplicates.
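Two concrete symptoms of that contract violation can be sketched like this (the class and method names are made up for the example). First, contains() reports true for an object that is not equal() to any element. Second, Set.equals() stops being symmetric between a TreeSet and a HashSet, because each side answers the containment check with its own notion of equality.

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class TreeSetContractDemo {
    // The set holds only 42, yet it claims to contain 42.00.
    static boolean treeContainsOtherScale() {
        Set<BigDecimal> tree = new TreeSet<BigDecimal>();
        tree.add(new BigDecimal("42"));
        return tree.contains(new BigDecimal("42.00"));
    }

    // Returns { tree.equals(hash), hash.equals(tree) } for {42} versus {42.0}.
    static boolean[] equalsBothWays() {
        Set<BigDecimal> tree = new TreeSet<BigDecimal>();
        tree.add(new BigDecimal("42"));
        Set<BigDecimal> hash = new HashSet<BigDecimal>();
        hash.add(new BigDecimal("42.0"));
        return new boolean[] { tree.equals(hash), hash.equals(tree) };
    }

    public static void main(String[] args) {
        System.out.println(treeContainsOtherScale()); // true
        boolean[] both = equalsBothWays();
        System.out.println(both[0]); // true:  the TreeSet looks up 42.0 via compareTo()
        System.out.println(both[1]); // false: the HashSet looks up 42 via equals()/hashCode()
    }
}
```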
