Top "Spark-streaming" questions

Spark Streaming is an extension of the core Apache Spark API that enables high-throughput, fault-tolerant stream processing of live data streams.

Use schema to convert AVRO messages with Spark to DataFrame

Is there a way to use a schema to convert Avro messages from Kafka into a DataFrame with Spark? The schema …

scala apache-spark apache-kafka spark-streaming avro
"You need to build Spark before running this program" error when running bin/pyspark

I am getting started with Spark. I am getting an issue when starting Spark. I downloaded it from the official Spark website, …

apache-spark apache-spark-sql pyspark spark-streaming spark-view-engine
SQL over Spark Streaming

This is the code to run simple SQL queries over Spark Streaming. import org.apache.spark.streaming.{Seconds, StreamingContext} import …

apache-spark spark-streaming
How to properly use PySpark to send data to a Kafka broker?

I'm trying to write a simple PySpark job, which would receive data from a Kafka broker topic, do some transformation …

python-2.7 pyspark spark-streaming kafka-python
What is the correct way to start/stop spark streaming jobs in yarn?

I have been experimenting and googling for many hours, with no luck. I have a Spark Streaming app that runs …

hadoop apache-spark spark-streaming yarn cloudera
KafkaConsumer is not safe for multi-threaded access

I am using the code below to read from a Kafka topic and process the data. JavaDStream<Row> transformedMessages = messages.…

spark-streaming
How to stop spark streaming when the data source has run out

I have a Spark Streaming job that reads from Kafka every 5 seconds, does some transformation on incoming data, and then …

python apache-spark apache-kafka pyspark spark-streaming
Spark Streaming from Kafka has error "numRecords must not be negative"

It's a kind of strange error, because I still push data to Kafka and consume messages from Kafka, and Exception in …

apache-spark apache-kafka spark-streaming
How to stop a running Spark Streaming application gracefully?

How do I stop Spark Streaming? My Spark Streaming job is running continuously. I want to stop in a graceful …

apache-spark spark-streaming
Back pressure in Kafka

I have a situation in Kafka where the producer publishes messages at a much higher rate than the consumer …

apache-kafka spark-streaming backpressure