When is CopyOnWriteArraySet useful to achieve thread-safe HashSet?

coderz picture coderz · Mar 25, 2015 · Viewed 8.2k times · Source

In Java, there is thread-safe version HashMap named ConcurrentHashMap and thread-safe version TreeMap named ConcurrentSkipListMap, but there is no ConcurrentHashSet for HashSet.

Instead, there are usually 4 ways to use thread-safe Set:

  1. Set<String> mySet = Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
  2. Set<String> s = Collections.synchronizedSet(new HashSet<String>());
  3. ConcurrentSkipListSet<E>
  4. CopyOnWriteArraySet<E>

1 use keySet() of ConcurrentHashMap to achieve both Set and thread-safe.

2 use synchronized way, it seems this way is not recommended.

3 is based on ConcurrentSkipListMap and is widely used.

4 is based on CopyOnWriteArrayList, thus it shares the same basic properties of CopyOnWriteArrayList. Following is select from CopyOnWriteArraySet doc: http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CopyOnWriteArraySet.html

  • It is best suited for applications in which set sizes generally stay small, read-only operations vastly outnumber mutative operations, and you need to prevent interference among threads during traversal.
  • It is thread-safe.
  • Mutative operations (add, set, remove, etc.) are expensive since they usually entail copying the entire underlying array.
  • Iterators do not support the mutative remove operation.
  • Traversal via iterators is fast and cannot encounter interference from other threads.
  • Iterators rely on unchanging snapshots of the array at the time the iterators were constructed.

Since 1 and 3 are commonly used, why does CopyOnWriteArraySet exist? When is CopyOnWriteArraySet useful?

Added: CopyOnWriteArraySet is based on CopyOnWriteArrayList, and the contains operation in List data structure is O(n), while Set data structure is for high performance contains operation, could anybody explain this?

Answer

Peter Lawrey picture Peter Lawrey · Mar 25, 2015

It is useful when you have a small set of element for a thread safe collection.

One example is a Set of listeners. You need to ensure uniqueness and iterate over them efficiently.

BTW CopyOnWriteArraySet has the lowest overhead on a per reference basis. It can be as little as 1/6 the size of the other collections. This is particularly useful if you have a lot of them.

while Set data structure is for high performance contains operation, could anybody explain this?

COWAS is more efficient in terms of memory and it's contains is faster for small collections than the alternatives. What is "high performance" depends on the use case.