Top "Spark-structured-streaming" questions

Spark Structured Streaming allows processing live data streams using DataFrame and Dataset APIs.

Why does a Spark application fail with “ClassNotFoundException: Failed to find data source: kafka” when run as an uber-jar built with sbt assembly?

I'm trying to run a sample like StructuredKafkaWordCount. I started with the Spark Structured Streaming Programming guide. My code is …

scala apache-spark sbt sbt-assembly spark-structured-streaming
Why does format("kafka") fail with "Failed to find data source: kafka." (even with uber-jar)?

I use HDP-2.6.3.0 with Spark2 package 2.2.0. I'm trying to write a Kafka consumer, using the Structured Streaming API, but I'm …

apache-spark apache-spark-sql spark-structured-streaming uberjar
Spark Structured Streaming automatically converts timestamp to local time

I have my timestamp in UTC and ISO8601, but using Structured Streaming, it gets automatically converted into the local time. …

java scala apache-spark apache-spark-sql spark-structured-streaming
How to use from_json with schema as string (i.e. a JSON-encoded schema)?

I'm reading a stream from Kafka, and I convert the value from Kafka (which is JSON) into a structure. from_…

apache-spark apache-spark-sql spark-structured-streaming
Spark Structured Streaming with Kafka: convert JSON without a schema (infer schema)

I read that Spark Structured Streaming doesn't support schema inference for reading Kafka messages as JSON. Is there a way to …

apache-spark apache-kafka schema spark-structured-streaming
Integrating Spark Structured Streaming with the Confluent Schema Registry

I'm using a Kafka Source in Spark Structured Streaming to receive Confluent encoded Avro records. I intend to use Confluent …

apache-spark apache-kafka avro confluent-schema-registry spark-structured-streaming
How to get Kafka offsets for structured query for manual and reliable offset management?

Spark 2.2 introduced a Kafka source for Structured Streaming. As I understand it, it relies on an HDFS checkpoint directory to store offsets and …

apache-spark apache-kafka apache-spark-sql offset spark-structured-streaming
Multiple aggregations in Spark Structured Streaming

I would like to do multiple aggregations in Spark Structured Streaming. Something like this: Read a stream of input files (…

apache-spark apache-spark-sql spark-structured-streaming
How to get the output from console streaming sink in Zeppelin?

I'm struggling to get the console sink working with PySpark Structured Streaming when run from Zeppelin. Basically, I'm not seeing …

apache-spark pyspark apache-zeppelin spark-structured-streaming
How to display a streaming DataFrame (as show fails with AnalysisException)?

I have some data that I'm streaming into a Kafka topic. I'm taking this streaming data and placing it into …

apache-spark pyspark apache-kafka spark-structured-streaming