Java 8 reduce: what is the BinaryOperator parameter used for?

chiperortiz · Jun 1, 2014 · Viewed 9.3k times

I am currently reading the O'Reilly book Java 8 Lambdas, which is really good. I came across an example like this.

I have the following code:

private final BiFunction<StringBuilder, String, StringBuilder> accumulator =
    (builder, name) -> {
        if (builder.length() > 0) builder.append(",");
        builder.append("Mister:").append(name);
        return builder;
    };

final Stream<String> stringStream = Stream.of(
    "John Lennon", "Paul Mccartney", "George Harrison", "Ringo Starr");
final StringBuilder reduce = stringStream
    .filter(a -> a != null)
    .reduce(new StringBuilder(), accumulator, (left, right) -> left.append(right));
System.out.println(reduce);
System.out.println(reduce.length());

This produces the expected output:

Mister:John Lennon,Mister:Paul Mccartney,Mister:George Harrison,Mister:Ringo Starr

My question is about the last parameter of the reduce method, which is a BinaryOperator.

What is this parameter used for? If I change the call to

.reduce(new StringBuilder(),accumulator,(left,right)->new StringBuilder());

the output is the same. If I pass null, a NullPointerException is thrown.

So what is this parameter actually used for?

UPDATE

Why do I get different results when I run it on a parallelStream?

First run:

returned StringBuilder length = 420

Second run:

returned StringBuilder length = 546

Third run:

returned StringBuilder length = 348

And so on. Why is this? Shouldn't it return all the values on every run?

Any help is greatly appreciated.

Thanks.

Answer

nosid · Jun 1, 2014

The method reduce in the interface Stream is overloaded. The parameters for the method with three arguments are:

  • identity
  • accumulator
  • combiner

The combiner supports parallel execution. In practice, it does not appear to be invoked for sequential streams, but the API makes no such guarantee. If you turn your stream into a parallel stream, you will most likely see a difference:

Stream<String> stringStream = Stream.of(
    "John Lennon", "Paul Mccartney", "George Harrison", "Ringo Starr")
    .parallel();
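To watch the combiner being invoked, here is a minimal sketch. The class name CombinerCalls, the method reduceNames and the call counter are made up for illustration; the counter is a side effect added only for observation and should not appear in real reduction functions. The sketch uses String concatenation instead of a StringBuilder so that the reduction itself stays correct in both modes.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BinaryOperator;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the original question or answer.
class CombinerCalls {

    // Concatenates the four names and reports how often the combiner ran.
    static String reduceNames(boolean parallel) {
        AtomicInteger calls = new AtomicInteger();

        // Counting is for observation only; the real work is left + right.
        BinaryOperator<String> combiner = (left, right) -> {
            calls.incrementAndGet();
            return left + right;
        };

        Stream<String> names = Stream.of(
            "John Lennon", "Paul Mccartney", "George Harrison", "Ringo Starr");
        if (parallel) {
            names = names.parallel();
        }

        String result = names.reduce("", String::concat, combiner);
        System.out.println((parallel ? "parallel:   " : "sequential: ")
            + calls.get() + " combiner call(s)");
        return result;
    }

    public static void main(String[] args) {
        System.out.println(reduceNames(false));
        System.out.println(reduceNames(true));
    }
}
```

Both calls return the same string. The number of combiner calls is an implementation detail: the sequential run typically reports none, while the parallel run reports one call per merge of two partial results.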

Here is an example of how the combiner can be used to transform a sequential reduction into a reduction that supports parallel execution. Suppose there is a stream with four Strings, and acc is used as an abbreviation for accumulator.apply. Then the result of the reduction can be computed as follows:

acc(acc(acc(acc(identity, "one"), "two"), "three"), "four");

With a compatible combiner, the above expression can be transformed into the following expression. Now it is possible to execute the two sub-expressions in different threads.

combiner.apply(
    acc(acc(identity, "one"), "two"),
    acc(acc(identity, "three"), "four"));
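The two expressions above can be checked by hand with an immutable accumulator. The following is only a sketch under assumptions: it picks String::concat for both accumulator and combiner and the empty string as identity, and the class name SplitReduction is invented for the example.

```java
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

// Hypothetical demo class; accumulator, combiner and identity are assumed.
class SplitReduction {
    static final String IDENTITY = "";
    static final BiFunction<String, String, String> ACCUMULATOR = String::concat;
    static final BinaryOperator<String> COMBINER = String::concat;

    // Abbreviation for accumulator.apply, as in the text.
    static String acc(String partial, String element) {
        return ACCUMULATOR.apply(partial, element);
    }

    // One chain of accumulator calls, left to right.
    static String sequential() {
        return acc(acc(acc(acc(IDENTITY, "one"), "two"), "three"), "four");
    }

    // Two independent halves, merged by the combiner.
    static String split() {
        return COMBINER.apply(
            acc(acc(IDENTITY, "one"), "two"),
            acc(acc(IDENTITY, "three"), "four"));
    }

    public static void main(String[] args) {
        System.out.println(sequential());                 // onetwothreefour
        System.out.println(split());                      // onetwothreefour
        System.out.println(sequential().equals(split())); // true
    }
}
```

Because Strings are immutable, each half starts from an untouched identity, so the halves could run on different threads without interfering.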

Regarding your second question, I use a simplified accumulator to explain the problem:

BiFunction<StringBuilder,String,StringBuilder> accumulator =
    (builder,name) -> builder.append(name);

According to the Javadoc for Stream::reduce, the accumulator has to be associative, and the combiner has to be compatible with the accumulator. With the combiner from the question, that would imply that the following two expressions return the same result:

acc(acc(acc(identity, "one"), "two"), "three")
combiner.apply(acc(identity, "one"), acc(acc(identity, "two"), "three"))

That is not true for the above accumulator. The problem is that you are mutating the object referenced by identity, and in a parallel stream that single StringBuilder is shared by every partial reduction. That is a bad idea for the reduce operation. Here are two alternative implementations that should work:

// identity = ""
BiFunction<String,String,String> accumulator = String::concat;

// identity = null
BiFunction<StringBuilder,String,StringBuilder> accumulator =
    (builder,name) -> builder == null
        ? new StringBuilder(name) : builder.append(name);
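Building on the first alternative, here is a sketch of how the original "Mister:" example could be written so that it gives the same result on a parallel stream. The class name MisterReduce, the method joinNames and the particular accumulator and combiner are assumptions for illustration, not code from the book or from the question. The combiner treats the empty string as "nothing accumulated yet", which keeps it compatible with the accumulator.

```java
import java.util.Arrays;
import java.util.Objects;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

// Hypothetical demo class; all names here are invented for the example.
class MisterReduce {

    // Never mutates its arguments: every call returns a new String.
    static final BiFunction<String, String, String> ACCUMULATOR =
        (partial, name) -> partial.isEmpty()
            ? "Mister:" + name
            : partial + ",Mister:" + name;

    // Joins two partial results; an empty side means nothing was accumulated.
    static final BinaryOperator<String> COMBINER =
        (left, right) -> left.isEmpty() ? right
            : right.isEmpty() ? left
            : left + "," + right;

    static String joinNames(String... names) {
        return Arrays.stream(names)
            .parallel()
            .filter(Objects::nonNull)
            .reduce("", ACCUMULATOR, COMBINER);
    }

    public static void main(String[] args) {
        String result = joinNames(
            "John Lennon", "Paul Mccartney", "George Harrison", "Ringo Starr");
        System.out.println(result);
        System.out.println(result.length());
    }
}
```

Since no shared object is modified, repeated runs should print the same text as the sequential output shown in the question, and the length no longer varies between runs.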