Difference between java 8 streams and parallel streams

Yogi Joshi picture Yogi Joshi · Aug 2, 2015 · Viewed 20.8k times · Source

I wrote code using Java 8 streams and parallel streams for the same functionality with a custom collector to perform an aggregation function. When I see CPU usage using htop, it shows all CPU cores being used for both 'streams' and 'parallel streams' version. So, it seems when list.stream() is used, it also uses all CPUs. Here, what is the precise difference between parallelStream() and stream() in terms of usage of multi-core.

Answer

Hoopje picture Hoopje · Aug 2, 2015

Consider the following program:

import java.util.ArrayList;
import java.util.List;

public class Foo {
    public static void main(String... args) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            list.add(i);
        }
        list.stream().forEach(System.out::println);
    }
}

You will notice that this program will output the numbers from 0 to 999 sequentially, in the order in which they are in the list. If we change stream() to parallelStream() this is not the case anymore (at least on my computer): all number are written, but in a different order. So, apparently, parallelStream() indeed uses multiple threads.

The htop is explained by the fact that even single-threaded applications are divided over mutliple cores by most modern operating systems (parts of the same thread may run on several cores, but of course not at the same time). So if you see that a process used more than one core, this does not mean necessarily that the program uses multiple threads.

Also the performance may not improve when using multiple threads. The cost of synchronization may nihilite the gains of using multiple threads. For simple testing scenarios this is often the case. For example, in the above example, System.out is synchronized. So, effectively, only number can be written at the same time, although multiple threads are used.