Spark Streaming is an extension of the core Apache Spark API that enables high-throughput, fault-tolerant stream processing of live data streams.
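The micro-batch model behind that sentence can be illustrated without a cluster: Spark Streaming chops a live stream into small batches and runs an ordinary Spark job on each one. Below is a pure-Python sketch of that idea — it is a simulation of the micro-batch word count, not the actual Spark API, and the batch size and input lines are made up:

```python
from collections import Counter
from itertools import islice

def micro_batches(stream, batch_size):
    """Discretize a (possibly unbounded) iterator of records into fixed-size
    batches, the way Spark Streaming discretizes a live stream into RDDs."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

def word_count(batch):
    """The per-batch 'job': split each line into words and count them,
    analogous to flatMap -> map(word, 1) -> reduceByKey(_ + _)."""
    counts = Counter()
    for line in batch:
        counts.update(line.split())
    return dict(counts)

# Hypothetical input: three lines arriving on a stream, batched two at a time.
lines = ["spark streaming", "kafka spark", "spark"]
results = [word_count(b) for b in micro_batches(lines, 2)]
# Each element of `results` is the word count of one micro-batch.
```

In the real API the same shape appears as transformations on a DStream; the point here is only that each batch is processed as an independent, small Spark job.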
- We have been using Spark Streaming with Kafka for a while, and until now we were using the createStream method …
  Tags: apache-spark, apache-kafka, spark-streaming
- Getting the below exception when I tried to run unit tests for my Spark Streaming code with ScalaTest under SBT on Windows. …
  Tags: scala, apache-spark, sbt, spark-streaming, scalatest
- I have some use cases that I would like clarified, regarding Kafka topic partitioning -> Spark …
  Tags: apache-spark, apache-kafka, spark-streaming
- I am using Spark on Hortonworks; when I execute the code below I get an exception. I also have …
  Tags: apache-spark, apache-spark-sql, spark-streaming, hortonworks-data-platform, hortonworks-sandbox
- I am trying to pass data from Kafka to Spark Streaming. This is what I've done so far: installed both …
apache-spark apache-kafka spark-streaming kafka-pythonI know there are many threads already on 'spark streaming connection refused' issues. But most of these are in Linux …
  Tags: scala, apache-spark, spark-streaming
- In Spark Streaming it is possible (and mandatory if you're going to use stateful operations) to set the StreamingContext to …
  Tags: apache-spark, spark-streaming, checkpointing
- I am trying to write a JavaPairRDD to a file on the local filesystem. Code below: JavaPairDStream<String, Integer> wordCounts = words.…
  Tags: apache-spark, streaming, pyspark, spark-streaming, hadoop-streaming
- I'm processing events using DataFrames converted from a stream of JSON events, which eventually get written out as Parquet …
  Tags: apache-spark, apache-spark-sql, spark-streaming, spark-dataframe, parquet
- To make it clear, I am not looking for an RDD from an array/list like List<Integer> list = …
  Tags: apache-spark, spark-streaming
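The checkpointing question above touches the core contract: stateful operations (updateStateByKey and friends) require a checkpoint directory, because the running state must survive a driver restart. A pure-Python sketch of what that recovery buys you — this is an illustration only, not the Spark API, and the pickle-based "checkpoint" is a stand-in for Spark's checkpoint files:

```python
import os
import pickle
import tempfile

def update_state(state, new_values):
    """updateStateByKey-style merge: fold this batch's (key, count) pairs
    into the running totals."""
    for key, n in new_values:
        state[key] = state.get(key, 0) + n
    return state

def checkpoint(state, path):
    """Persist the running state so a restarted driver can resume from it."""
    with open(path, "wb") as f:
        pickle.dump(state, f)

def recover(path):
    """Load the last checkpoint, or start with empty state on first run."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return {}

ckpt = os.path.join(tempfile.mkdtemp(), "state.pkl")
state = recover(ckpt)                                 # first run: empty state
state = update_state(state, [("spark", 1), ("kafka", 1)])
checkpoint(state, ckpt)

# Simulated driver restart: the state is rebuilt from the checkpoint,
# then the next batch is applied on top of it.
state = recover(ckpt)
state = update_state(state, [("spark", 2)])
```

Without the checkpoint, the restarted "driver" would start from `{}` and the totals accumulated before the crash would be lost — which is why Spark refuses to run stateful operations until a checkpoint directory is set.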
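The "write JavaPairRDD to a local file" question above usually resolves to DStream.saveAsTextFiles(prefix), which emits one directory of part files per batch interval rather than a single file. A pure-Python sketch of that per-batch output layout — the directory and file names here mimic the idea but are not Spark's exact naming scheme:

```python
import os
import tempfile

def save_batch_as_text(counts, prefix, batch_time):
    """Write one batch's (word, count) pairs under <prefix>-<batch_time>/part-00000,
    mimicking the one-directory-per-batch layout of DStream.saveAsTextFiles."""
    out_dir = f"{prefix}-{batch_time}"
    os.makedirs(out_dir, exist_ok=True)
    part = os.path.join(out_dir, "part-00000")
    with open(part, "w") as f:
        for word, n in sorted(counts.items()):
            f.write(f"({word},{n})\n")
    return part

# Hypothetical batch: counts from one micro-batch, written under a temp prefix.
prefix = os.path.join(tempfile.mkdtemp(), "wordcounts")
path = save_batch_as_text({"spark": 2, "kafka": 1}, prefix, 1000)
```

The practical consequence is the same as in Spark: consumers of the output must expect a directory per batch (and typically multiple part files, one per partition), not one continuously appended file.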