I am able to insert duplicate entries in TreeSet. How to overcome this

Prasad picture Prasad · Jul 12, 2013 · Viewed 12.9k times · Source

I have a class called Employee which has employeeName and employeeId as its member variables.I am creating new Employee objects and then adding it into a TreeSet where I want to sort it based on the employeeId. But I consider 2 Employee objects to be equal if they have the same employeeName. Set does not allow duplicates. But here I can observe a strange behavior. This is my code.(I am not using getters and setters here. I am directly accessing the member variables.)

package secondOne;

import java.util.Set;
import java.util.TreeSet;

class Employee implements Comparable<Employee> {

    String employeeName;
    int employeeId;

    public Employee(String name, int id) {
        this.employeeName = name;
        this.employeeId = id;
    }

    public int compareTo(Employee emp) {
        //return this.employeeName.compareTo(emp.employeeName);
        return (this.employeeId - emp.employeeId);
    }

    @Override
    public String toString() {
        return ("Name is: " + employeeName + " Emp id is: " + employeeId);
    }

    @Override
    public boolean equals(Object emp) {
        if (emp instanceof Employee && ((Employee) emp).employeeName == this.employeeName) {
            return true;
        }
        return false;
    }

}

public class TestingSetsWithComparable {
    /**
     * @param args
     */
    public static void main(String[] args) {
        Employee e1 = new Employee("A", 1);
        Employee e2 = new Employee("A", 2);
        Employee e3 = new Employee("B", 3);

        Set<Employee> set = new TreeSet<Employee>();
        set.add(e1);
        set.add(e2);
        set.add(e3);
        System.out.println(set);
    }
}

Here the output for the above code is,
[Name is: A Emp id is: 1, Name is: A Emp id is: 2, Name is: B Emp id is: 3]

My first question is, in equals() method, I consider 2 Employee objests to be equal if they have same employeeName but in compareTo method, I am using employeeId to sort. In this case, output is showing 2 entries for employeeName 'A'. How is TreeSet allowing a duplicate entry when I consider 2 objects to be same if they have the same employeeName. How is this possible..? And Second question is, in the compareTo method, if I use employeeName to sort, then I dont get the second repeated entry for the same name. The output in this second case is
[Name is: A Emp id is: 1, Name is: B Emp id is: 3]

Why is it so..?

Answer

Luiggi Mendoza picture Luiggi Mendoza · Jul 12, 2013

The problem is here:

((Employee)emp).employeeName== this.employeeName

You must compare Strings using equals method:

((Employee)emp).employeeName.equals(this.employeeName)

Refer to How do I compare strings in Java?

Also, since you're overriding equals method, it would be good if you override hashCode method as well, as stated in Object#equals contract:

Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.

Additional: Since you're using TreeSet, it will use the compareTo method instead of the equals and hashCode methods. This is because TreeSet implements SortedSet interface. Refer to SortedSet javadoc (emphasis mine):

A Set that further provides a total ordering on its elements. The elements are ordered using their natural ordering(i.e. implementing Comparable<T>), or by a Comparator typically provided at sorted set creation time.

You should implement this method in the accordingly to your needs:

public int compareTo(Employee emp) {
    if (this.employeeName.equals(emp.employeeName)) {
        return 0;
    }
    //removed the comparison by subtraction since it will behave wrongly on int overflow
    return new Integer(this.employeeId).compareTo(emp.employeeId);
}

Since you're comparing Strings, I would recommend using StringUtils class from Apache Commons Lang that provides helper methods to avoid null checks and others.