For example in the code below:
public int commonTwo(String[] a, String[] b)
{
Set common = new HashSet<String>(Arrays.asList(a));
common.retainAll(new HashSet<String>(Arrays.asList(b)));
return common.size();
}
Lets take a peruse at the code. The method retainAll
is inherited from AbstractCollection
and (at least in OpenJDK) looks like this:
public boolean retainAll(Collection<?> c) {
boolean modified = false;
Iterator<E> it = iterator();
while (it.hasNext()) {
if (!c.contains(it.next())) {
it.remove();
modified = true;
}
}
return modified;
}
There is one big this to note here, we loop over this.iterator()
and call c.contains
. So the time complexity is n
calls to c.contains
where n = this.size()
and at most n
calls to it.remove()
.
This important thing is that the contains
method is called on the other Collection
and so the complexity is dependant upon the complexity of the other Collection
contains
.
So, whilst:
Set<String> common = new HashSet<>(Arrays.asList(a));
common.retainAll(new HashSet<>(Arrays.asList(b)));
Would be O(a.length)
, as HashSet.contains
and HashSet.remove
are both O(1)
(amortized).
If you were to call
common.retainAll(Arrays.asList(b));
Then due to the O(n)
contains
on Arrays.ArrayList
this would become O(a.length * b.length)
- i.e. by spending O(n)
copying the array to a HashSet
you actually make the call to retainAll
much faster.
As far as space complexity goes, no additional space (beyond the Iterator
) is required by retainAll
, but your invocation is actually quite expensive space-wise as you allocate two new HashSet
implementations which are actually fully fledged HashMap
.
Two further things can be noted:
HashSet
from the elements in a
- a cheaper collection that also has O(1)
remove from the middle such as an LinkedList
can be used. (cheaper in memory and also build time - a hash table is not built)b.size()
.