What is the main difference between HashSet, TreeSet, LinkedHashSet and HashMap, and how do they work in Java?

Achiever · Nov 26, 2013 · Viewed 14.9k times

I understand that LinkedHashSet does not allow duplicate elements on insertion, but I don't understand how HashSet works in Java. I know that HashSet is backed by a hash table, which stores the elements and also rejects duplicates. TreeSet is similar to HashSet in that it does not allow duplicates, so only unique elements are kept, and it iterates in ascending order.

I have one more doubt regarding HashMap: HashMap does not maintain order, and it may have one null key and multiple null values. I don't understand what this actually means. Is there a practical example?

What I know is that HashMap stores key/value pairs in buckets, and each bucket has a unique number, so a key and its value can be located and retrieved from its bucket. When I put a key/value pair into the map, it goes into the bucket identified by the hash code of the key.

For example, if the hash code of the key is 101, the pair is stored in bucket 101. One bucket can store more than one key/value pair. Suppose object1 is "A", object2 is "A" and object3 is "B", and they have the same hash code. Then the different objects that share a hash code are stored in the same bucket. My doubt is: should objects with the same hash code be equal, and should different objects have different hash codes?

This is the program using HashSet:

    import java.util.*;

    public class Simple {
        public static void main(String[] args) {
            // HashSet makes no guarantee about iteration order
            Set<String> hh = new HashSet<String>();
            hh.add("D");
            hh.add("A");
            hh.add("B");
            hh.add("C");
            hh.add("a");
            System.out.println("Checking the size is:" + hh.size());
            System.out.println(hh);

            // Walk the set explicitly; the order matches the println above
            Iterator<String> i = hh.iterator();
            while (i.hasNext()) {
                System.out.println(i.next());
            }
        }
    }

Output is,

Checking the size is:5
[D, A, B, a, C]
D
A
B
a
C

My doubt is, why is "a" inserted between "B" and "C"?

Now, using LinkedHashSet:

    import java.util.*;

    public class Simple {
        public static void main(String[] args) {
            // LinkedHashSet iterates in insertion order
            Set<String> hh = new LinkedHashSet<String>();
            hh.add("D");
            hh.add("A");
            hh.add("B");
            hh.add("C");
            hh.add("a");

            System.out.println("Checking the size is:" + hh.size());
            System.out.println(hh);

            Iterator<String> i = hh.iterator();
            while (i.hasNext()) {
                System.out.println(i.next());
            }
        }
    }

I understand that it follows insertion order and avoids duplicate elements. So the output is,

Checking the size is:5
[D, A, B, C, a]
D
A
B
C
a

Now, using TreeSet:

    import java.util.*;

    public class Simple {
        public static void main(String[] args) {
            // TreeSet keeps its elements sorted (natural String ordering here)
            Set<String> hh = new TreeSet<String>();
            hh.add("1");
            hh.add("5");
            hh.add("3");
            hh.add("5"); // duplicate, ignored
            hh.add("2");
            hh.add("7");

            System.out.println("Checking the size is:" + hh.size());
            System.out.println(hh);

            Iterator<String> i = hh.iterator();
            while (i.hasNext()) {
                System.out.println(i.next());
            }
        }
    }

Here, I understand only that TreeSet keeps its elements in ascending order.

The output is,

Checking the size is:5
[1, 2, 3, 5, 7]
1
2
3
5
7

Then my doubt is: how does HashSet work in Java? I know that LinkedHashSet uses a doubly linked list. If so, how does it store the elements? What is a doubly linked list and how does it work? And where would each of HashSet, TreeSet and LinkedHashSet be used in Java, and which one has the best performance?

Answer

Stephen C · Nov 26, 2013

My doubt is, why is "a" inserted between "B" and "C"?

A TreeSet keeps the entries sorted, either by their natural ordering or by a Comparator that you supply.

A LinkedHashSet preserves the insertion order.

A HashSet does not preserve the order of insertion, and it does not sort/order the entries. That means that when you iterate over the set, the entries are returned in an order that is hard to fathom ... and of no practical significance. There is no particular "reason" that "a" is inserted at that point. That's just how it turned out ... given the set of input keys and the order in which they were inserted.
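To see the three ordering behaviours side by side, here is a minimal sketch that loads the same five strings from the question into each kind of set. The class name `SetOrderDemo` and the helper `iterationOrder` are made up for this illustration. No output is shown for the HashSet line, because its order can differ between Java versions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; the names are not from the original post.
class SetOrderDemo {
    // Adds the question's five strings and returns them in the set's iteration order.
    static List<String> iterationOrder(Set<String> set) {
        Collections.addAll(set, "D", "A", "B", "C", "a");
        return new ArrayList<String>(set);
    }

    public static void main(String[] args) {
        // HashSet: order depends on hash codes and table size; do not rely on it.
        System.out.println("HashSet:       " + iterationOrder(new HashSet<String>()));
        // LinkedHashSet: insertion order, so [D, A, B, C, a].
        System.out.println("LinkedHashSet: " + iterationOrder(new LinkedHashSet<String>()));
        // TreeSet: natural String ordering; uppercase sorts before lowercase,
        // so [A, B, C, D, a].
        System.out.println("TreeSet:       " + iterationOrder(new TreeSet<String>()));
    }
}
```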

My only doubt is, how does HashSet work in Java?

It is implemented as a hash table (a HashSet is in fact backed by a HashMap). Read the Wikipedia page on hash tables for a general overview, and the source code of java.util.HashMap and java.util.HashSet for the details.

The short answer is that HashSet and HashMap are both a hash table implemented as an array of hash chains.
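As a rough illustration of "an array of hash chains", here is a deliberately simplified sketch. `TinyHashSet` is a hypothetical teaching class, not the JDK code: the real classes also resize the table and mix the hash bits before choosing a bucket. It also speaks to the hash code question: equal objects must have equal hash codes, but different objects may share one, in which case they sit in the same bucket and are told apart with equals().

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical teaching class; java.util.HashSet is more elaborate.
class TinyHashSet {
    // One chain (list of elements) per bucket.
    private final List<List<String>> buckets = new ArrayList<List<String>>();
    private int size;

    TinyHashSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new LinkedList<String>());
        }
    }

    // Maps a hash code to a bucket index; floorMod keeps the result non-negative.
    int bucketIndex(String s) {
        return Math.floorMod(s.hashCode(), buckets.size());
    }

    // Returns false if an equal element is already in the chain.
    boolean add(String s) {
        List<String> chain = buckets.get(bucketIndex(s));
        if (chain.contains(s)) { // contains() compares with equals()
            return false;
        }
        chain.add(s);
        size++;
        return true;
    }

    boolean contains(String s) {
        return buckets.get(bucketIndex(s)).contains(s);
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        TinyHashSet set = new TinyHashSet(16);
        // "Aa" and "BB" are different strings with the same hash code (2112),
        // so they share a bucket, yet both are stored.
        System.out.println(set.add("Aa")); // true
        System.out.println(set.add("BB")); // true
        System.out.println(set.add("Aa")); // false: duplicate by equals()
        System.out.println(set.size());    // 2
    }
}
```

So a bucket number is derived from the hash code rather than being the hash code itself, and a shared hash code only means a shared bucket, not equality.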

And I know that LinkedHashSet uses a doubly linked list. If so, how does it store the elements?

LinkedHashSet is essentially a hash table with an additional linked list that records the insertion order. The elements are stored in the main hash table ... and that is what provides fast lookup. Again, refer to the source code for details.
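A small sketch of what that recorded insertion order means in practice. The class name `LinkedHashSetDemo` is made up for this example. Re-adding an element that is already present does not move it, while removing it and adding it again puts it at the end.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class for illustration only.
class LinkedHashSetDemo {
    static List<String> demo() {
        Set<String> set = new LinkedHashSet<String>();
        set.add("D");
        set.add("A");
        set.add("B");
        set.add("A");    // already present: rejected, position unchanged
        set.remove("D"); // removed from the set, so its position is forgotten
        set.add("D");    // inserted again, so it now comes last
        return new ArrayList<String>(set);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [A, B, D]
    }
}
```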

What is a doubly linked list and how does it work?

Read the article in Wikipedia on doubly linked lists.
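For a quick feel of the idea, here is a minimal sketch of a doubly linked list. `DoublyLinkedDemo` and its `Node` class are hypothetical and are not how the JDK lays out its entries. Each node points to both its predecessor and its successor, so the list can be walked in either direction and a node can be unlinked without searching for its neighbours.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical teaching class, not the JDK's internal implementation.
class DoublyLinkedDemo {
    static final class Node {
        final String value;
        Node prev; // previous node, or null at the head
        Node next; // next node, or null at the tail
        Node(String value) { this.value = value; }
    }

    private Node head;
    private Node tail;

    // Appends a value and returns its node so it can be unlinked later.
    Node addLast(String value) {
        Node n = new Node(value);
        if (tail == null) {
            head = n;
        } else {
            tail.next = n;
            n.prev = tail;
        }
        tail = n;
        return n;
    }

    // Unlinks a node in constant time by rewiring its two neighbours.
    void unlink(Node n) {
        if (n.prev == null) { head = n.next; } else { n.prev.next = n.next; }
        if (n.next == null) { tail = n.prev; } else { n.next.prev = n.prev; }
        n.prev = null;
        n.next = null;
    }

    List<String> forward() {
        List<String> out = new ArrayList<String>();
        for (Node n = head; n != null; n = n.next) { out.add(n.value); }
        return out;
    }

    List<String> backward() {
        List<String> out = new ArrayList<String>();
        for (Node n = tail; n != null; n = n.prev) { out.add(n.value); }
        return out;
    }

    public static void main(String[] args) {
        DoublyLinkedDemo list = new DoublyLinkedDemo();
        list.addLast("D");
        Node a = list.addLast("A");
        list.addLast("B");
        list.unlink(a);
        System.out.println(list.forward());  // [D, B]
        System.out.println(list.backward()); // [B, D]
    }
}
```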


Then where would each of HashSet, TreeSet and LinkedHashSet be used in Java, and which one has the best performance?

There are a number of things to think about when choosing between these three classes (and others):

  • Do they provide the required functionality? For example, we've already seen that they have different behaviour with respect to the order of iteration.

  • Do they have the required concurrency properties? For example, are they thread-safe? Do they deal with contention? Do they allow concurrent modification?

  • How much space do they require?

  • What are the performance (time) characteristics?

On the last two points:

  • A TreeSet uses the least space, and a LinkedHashSet uses the most.

  • A HashSet tends to be fastest for lookup, insertion and deletion for larger sets, and a TreeSet tends to be slowest. HashSet operations take constant time on average, while TreeSet operations take O(log n) time.
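On the functionality point, one difference beyond iteration order is null handling, which also touches the question's "one null key and multiple null values" remark about HashMap. The sketch below uses a made-up class name, `NullHandlingDemo`, and assumes a TreeSet that uses natural ordering (no Comparator).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class for illustration only.
class NullHandlingDemo {
    // A HashSet accepts a single null element.
    static boolean hashSetAcceptsNull() {
        Set<String> set = new HashSet<String>();
        return set.add(null);
    }

    // A TreeSet with natural ordering cannot compare null, so add(null) throws.
    static boolean treeSetRejectsNull() {
        Set<String> sorted = new TreeSet<String>();
        try {
            sorted.add(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // A HashMap allows one null key and any number of null values.
    static Map<String, String> mapWithNulls() {
        Map<String, String> map = new HashMap<String, String>();
        map.put(null, "first");
        map.put(null, "second"); // same key, so this replaces "first"
        map.put("x", null);      // a null value
        map.put("y", null);      // another null value under a different key
        return map;
    }

    public static void main(String[] args) {
        System.out.println(hashSetAcceptsNull());             // true
        System.out.println(treeSetRejectsNull());             // true
        Map<String, String> map = mapWithNulls();
        System.out.println(map.size());                       // 3
        System.out.println(map.get(null));                    // second
        System.out.println(map.containsKey("x"));             // true
    }
}
```

Because get() returns null both for a missing key and for a key mapped to null, containsKey() is the way to tell the two cases apart.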