Fastest way to check if a List<String> contains a unique String

Ben · Jul 22, 2010 · Viewed 67.8k times

Basically, I have about 1,000,000 strings, and for each request I have to check whether a given String belongs to the list.

I'm worried about the performance, so what's the best method? ArrayList? Hash?

Answer

krock · Jul 22, 2010

Your best bet is to use a HashSet and check if a string exists in the set via the contains() method. HashSets are built for fast access via the use of Object methods hashCode() and equals(). The Javadoc for HashSet states:

"This class offers constant time performance for the basic operations (add, remove, contains and size), assuming the hash function disperses the elements properly among the buckets."

HashSet stores objects in hash buckets, which is to say that the value returned by the hashCode() method determines which bucket an object is stored in. This way, the number of equality checks the HashSet has to perform via the equals() method is reduced to just the other objects in the same hash bucket.
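As a rough illustration of the approach described above, here is a minimal sketch that copies a list into a HashSet once and then answers each lookup with contains(). The class name StringLookup, the helper buildIndex, and the sample strings are made up for this example; in the real case the list would hold the million strings.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StringLookup {

    // Hypothetical helper: build the set once, up front.
    // Copying n strings into a HashSet takes O(n) time.
    static Set<String> buildIndex(List<String> strings) {
        return new HashSet<>(strings);
    }

    public static void main(String[] args) {
        // Stand-in data; the real list would be loaded from wherever the strings live.
        List<String> strings = Arrays.asList("alpha", "beta", "gamma");
        Set<String> index = buildIndex(strings);

        // Each lookup hashes the string and only compares against
        // entries that landed in the same bucket.
        System.out.println(index.contains("beta"));  // true
        System.out.println(index.contains("delta")); // false
    }
}
```

The set should be built once and reused across requests. Rebuilding it per request would cost more than a linear scan of the list.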

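To use HashSets and HashMaps effectively, your objects must conform to the equals() and hashCode() contract outlined in the Javadoc for java.lang.Object: in particular, two objects that are equal according to equals() must return the same hashCode(). In the case of java.lang.String, both methods are already implemented correctly.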
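String needs no extra work, but if you ever key a HashSet on your own type, the contract has to be implemented by hand. The sketch below shows one common way to do it. The Tag class and its fields are invented purely for illustration and are not part of the original question.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class ContractDemo {

    // Hypothetical value type used as a set element.
    static final class Tag {
        private final String name;
        private final int version;

        Tag(String name, int version) {
            this.name = name;
            this.version = version;
        }

        // Two tags are equal when both fields match.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Tag)) return false;
            Tag other = (Tag) o;
            return version == other.version && Objects.equals(name, other.name);
        }

        // Derived from the same fields as equals(), so equal tags
        // always produce the same hash code and land in the same bucket.
        @Override
        public int hashCode() {
            return Objects.hash(name, version);
        }
    }

    public static void main(String[] args) {
        Set<Tag> tags = new HashSet<>();
        tags.add(new Tag("release", 1));

        // A distinct but equal instance is found because of the overrides above.
        System.out.println(tags.contains(new Tag("release", 1))); // true
        System.out.println(tags.contains(new Tag("release", 2))); // false
    }
}
```

Without the hashCode() override, the first lookup would almost certainly report false, because the default implementation inherited from Object is not based on the field values.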