Multiple aggregations in Spark Structured Streaming

Kaptrain · Dec 7, 2016

I would like to do multiple aggregations in Spark Structured Streaming.

Something like this:

  • Read a stream of input files (from a folder)
  • Perform aggregation 1 (with some transformations)
  • Perform aggregation 2 (and more transformations)

When I run this in Structured Streaming, it fails with the error "Multiple streaming aggregations are not supported with streaming DataFrames/Datasets".

Is there a way to do such multiple aggregations in Structured Streaming?

Answer

Mahesh Chand · Aug 4, 2017

This is not supported within a single streaming query, but there are workarounds. One is to perform a single aggregation and write its result to Kafka, then read that result back from Kafka in a second query and apply the next aggregation there. This has worked for me.
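A minimal PySpark sketch of this two-query approach, under several assumptions that are not from the original post: the input files are JSON with `event_time`, `country` and `amount` fields, the first aggregation is a per-minute sum, the second is a per-country maximum of those sums, and the Spark Kafka connector package is on the classpath. The function names, schemas and the `kafka_options` helper are hypothetical. The first query uses a watermark with append output so that each window should reach Kafka only once, which keeps the second aggregation from counting intermediate updates.

```python
from typing import Dict

# Hypothetical schemas (DDL strings); adjust them to the real input files.
INPUT_SCHEMA = "event_time TIMESTAMP, country STRING, amount DOUBLE"
STAGE1_SCHEMA = "minute TIMESTAMP, country STRING, total DOUBLE"


def kafka_options(bootstrap_servers: str, topic: str, reading: bool) -> Dict[str, str]:
    """Build options for Spark's Kafka source (reading) or Kafka sink (writing)."""
    # The source selects topics with "subscribe"; the sink names its target with "topic".
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe" if reading else "topic": topic,
    }


def start_first_aggregation(spark, input_dir, bootstrap_servers, topic, checkpoint_dir):
    """Query 1: per-minute totals per country, written to Kafka as JSON."""
    from pyspark.sql import functions as F  # imported lazily; requires pyspark

    events = spark.readStream.schema(INPUT_SCHEMA).json(input_dir)
    per_minute = (
        events.withWatermark("event_time", "10 minutes")
        .groupBy(F.window("event_time", "1 minute"), "country")
        .agg(F.sum("amount").alias("total"))
    )
    # The Kafka sink expects a "value" column and, optionally, a "key" column.
    records = per_minute.select(
        F.col("country").alias("key"),
        F.to_json(
            F.struct(F.col("window.start").alias("minute"), "country", "total")
        ).alias("value"),
    )
    return (
        records.writeStream.format("kafka")
        .options(**kafka_options(bootstrap_servers, topic, reading=False))
        .option("checkpointLocation", checkpoint_dir)
        .outputMode("append")  # emit a window only after the watermark passes it
        .start()
    )


def start_second_aggregation(spark, bootstrap_servers, topic, checkpoint_dir):
    """Query 2: read the per-minute totals back and aggregate them again."""
    from pyspark.sql import functions as F  # imported lazily; requires pyspark

    raw = (
        spark.readStream.format("kafka")
        .options(**kafka_options(bootstrap_servers, topic, reading=True))
        .load()
    )
    # Kafka values arrive as bytes; cast to string and parse the JSON payload.
    parsed = raw.select(
        F.from_json(F.col("value").cast("string"), STAGE1_SCHEMA).alias("r")
    ).select("r.*")
    peak = parsed.groupBy("country").agg(F.max("total").alias("peak_minute_total"))
    return (
        peak.writeStream.format("console")
        .option("checkpointLocation", checkpoint_dir)
        .outputMode("complete")
        .start()
    )
```

Each query needs its own checkpoint directory, and both run as independent streaming queries, so the second one only sees what the first has already written to the topic.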