Java contains vs anyMatch behaviour

Tranquility · Feb 4, 2016

Suppose I have an ArrayList of type Name (names) and want to check whether it contains a given Name object (n). I could do it two ways:

boolean exists = names.contains(n);

or

boolean exists = names.stream().anyMatch(x -> x.equals(n));

I was wondering whether these two behave the same, and in particular what happens if n is null.

For contains, as I understand it, if the argument is null, it returns true if the list contains null. How would I achieve this with anyMatch? Would it be by using Objects.equals(x, n)?

If that is how it works, which approach is more efficient? Is it anyMatch, since it can take advantage of laziness and parallelism?

Answer

Marco13 · Feb 4, 2016

The problem with the stream-based version is that if the collection (and thus its stream) contains null elements, the predicate will throw a NullPointerException when it tries to call equals on such a null element. A null n is not the issue by itself: for a non-null x, x.equals(null) simply returns false.

This could be avoided with

boolean exists = names.stream().anyMatch(x -> Objects.equals(x, n));
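A minimal sketch of the difference, assuming String elements as a stand-in for the Name class from the question (the class and method names here are illustrative only):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class NullSafeContainsDemo {

    // Mirrors the question's predicate; throws once the stream reaches a null element.
    static boolean anyMatchWithEquals(List<String> names, String n) {
        return names.stream().anyMatch(x -> x.equals(n));
    }

    // Null-safe variant: Objects.equals handles null on either side.
    static boolean anyMatchWithObjectsEquals(List<String> names, String n) {
        return names.stream().anyMatch(x -> Objects.equals(x, n));
    }

    public static void main(String[] args) {
        // Assumed sample data: a list that contains a null element.
        List<String> names = new ArrayList<>(Arrays.asList("Alice", null, "Bob"));

        System.out.println(names.contains(null));                    // true
        System.out.println(anyMatchWithObjectsEquals(names, null));  // true
        System.out.println(anyMatchWithObjectsEquals(names, "Bob")); // true

        try {
            // "Alice" does not match, so the predicate is applied to the null element next.
            anyMatchWithEquals(names, "Bob");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException at the null element");
        }
    }
}
```

Note that the x.equals(n) version only fails if the stream actually reaches a null element before finding a match, which makes the bug easy to miss in testing.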

But there is no practical advantage to be expected for the stream-based solution in this case. Parallelism might bring an advantage for really large lists, but one should not casually throw in some parallel() here and there assuming that it may make things faster. First, you should clearly identify the actual bottlenecks.
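As for laziness: ArrayList.contains already stops at the first matching element, so the short-circuiting of anyMatch is not an extra advantage. The following sketch counts how many elements a sequential anyMatch inspects, using peek purely for illustration. The class name, helper methods and sample data are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;

class ShortCircuitDemo {

    // Counts the elements a sequential anyMatch looks at before it returns.
    static int elementsVisited(List<String> names, String target) {
        AtomicInteger visited = new AtomicInteger();
        names.stream()
             .peek(x -> visited.incrementAndGet())
             .anyMatch(x -> Objects.equals(x, target));
        return visited.get();
    }

    // Builds hypothetical sample data: "name0", "name1", ...
    static List<String> sampleNames(int size) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            names.add("name" + i);
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = sampleNames(1000);

        // Both approaches stop early when the match is near the front of the list.
        System.out.println(names.contains("name4"));
        System.out.println(elementsVisited(names, "name4") + " of " + names.size() + " elements visited");

        // Without a match, every element has to be inspected either way.
        System.out.println(elementsVisited(names, "missing") + " of " + names.size() + " elements visited");
    }
}
```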

And in terms of readability, I'd prefer the first, classical solution here. If you want to check whether the list of names.contains(aParticularValue), you should do this - it just reads like prose and makes the intent clear.

EDIT

Another advantage of the contains approach was mentioned in the comments and in the other answer, and it is worth repeating here: if the type of the names collection is later changed, for example to a HashSet, then you'll get the faster contains check (expected O(1) instead of O(n)) for free, without changing any other part of the code. The stream-based solution would still have to iterate over the elements until it finds a match, which could be significantly slower.
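A small sketch of that point, again assuming String elements in place of Name and a hypothetical helper method exists. The call site is written against Collection, so swapping the implementation does not touch it:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;

class CollectionSwapDemo {

    // The call site only depends on Collection.contains.
    static boolean exists(Collection<String> names, String n) {
        return names.contains(n);
    }

    public static void main(String[] args) {
        Collection<String> asList = new ArrayList<>(Arrays.asList("Alice", "Bob"));
        Collection<String> asSet = new HashSet<>(asList);

        System.out.println(exists(asList, "Bob")); // true, found by a linear scan
        System.out.println(exists(asSet, "Bob"));  // true, found by a hash lookup

        // The stream version gives the same answer on the set,
        // but it tests elements one by one instead of using the hash table.
        System.out.println(asSet.stream().anyMatch(x -> x.equals("Bob"))); // true
    }
}
```

For a custom class such as Name, the HashSet lookup only works as expected if equals and hashCode are overridden consistently.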