Collecting from a parallel stream in Java 8

Vipul Goyal · May 20, 2017 · Viewed 15.3k times

I want to take an input, process it with a parallel stream, and get the output as a list. The input could be any List or any other collection that supports streams.

My concern is that if we want the output as a map, Java gives us an option like this:

list.parallelStream().collect(Collectors.toConcurrentMap(args))

But I cannot see an option for collecting from a parallel stream into a list in a thread-safe way. One other option I see is to use

list.parallelStream().collect(Collectors.toCollection(<Concurrent Implementation>))

This way we can pass various concurrent implementations to the collect method. But I think CopyOnWriteArrayList is the only List implementation in java.util.concurrent. We could use one of the queue implementations instead, but those do not behave like a list. What I mean is that we can work around it to get a list.

Could you please tell me the best way to get the output as a list?

Note: I could not find any other post related to this; any reference would be helpful.

Answer

Andreas · May 20, 2017

The Collection object used to receive the data being collected does not need to be concurrent. You can give it a simple ArrayList.

That is because the values from a parallel stream are not actually collected into a single Collection object. Each thread collects its own data, and then all the sub-results are merged into a single final Collection object.
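As a minimal sketch of the point above, the following collects from a parallel stream into a plain ArrayList. The class name ParallelCollectDemo and the method squares are made up for illustration; only the stream and Collectors calls come from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class; the name is an assumption made for this example.
public class ParallelCollectDemo {

    // Squares each element on a parallel stream and collects into a
    // non-concurrent ArrayList. Each worker fills its own ArrayList,
    // and the partial lists are merged afterwards.
    static List<Integer> squares(List<Integer> input) {
        return input.parallelStream()
                .map(n -> n * n)
                .collect(Collectors.toCollection(ArrayList::new));
    }

    public static void main(String[] args) {
        List<Integer> input = IntStream.rangeClosed(1, 1000)
                .boxed()
                .collect(Collectors.toList());

        List<Integer> result = squares(input);

        System.out.println(result.size());        // prints 1000
        System.out.println(result.subList(0, 5)); // prints [1, 4, 9, 16, 25]
    }
}
```

Collectors.toList() should work just as well here if the concrete list type does not matter. Because the source list is ordered, the result keeps the encounter order even though the work is done in parallel.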

This is all well-documented in the Collector javadoc, and the Collector is the parameter you're giving to the collect() method:

<R,A> R collect(Collector<? super T,A,R> collector)
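To make the roles of T, A and R in that signature concrete, here is a rough sketch of a hand-written list collector built with Collector.of. The class ListCollectorSketch and the methods toArrayList and collectInParallel are hypothetical names chosen for this example; this is not how the JDK implements Collectors.toList(), only an illustration of the supplier, accumulator and combiner functions that the Collector javadoc describes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

// Hypothetical sketch class; names are assumptions made for this example.
public class ListCollectorSketch {

    // T = element type, A = List<T> (per-thread container), R = List<T> (result).
    static <T> Collector<T, List<T>, List<T>> toArrayList() {
        return Collector.of(
                ArrayList::new,            // supplier: a fresh container for each chunk of work
                List::add,                 // accumulator: add one element to a container
                (left, right) -> {         // combiner: merge two partial containers
                    left.addAll(right);
                    return left;
                });
    }

    // Collects the input on a parallel stream using the collector above.
    static <T> List<T> collectInParallel(List<T> input) {
        return input.parallelStream().collect(toArrayList());
    }

    public static void main(String[] args) {
        List<String> result = collectInParallel(Arrays.asList("a", "b", "c", "d"));
        System.out.println(result); // prints [a, b, c, d]
    }
}
```

No container is shared between threads while elements are being added, which is why a non-concurrent ArrayList is enough. The combiner is the only place where partial results meet, and the stream framework calls it on containers that are no longer being filled.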