Top "Spark-streaming" questions

Spark Streaming is an extension of the core Apache Spark API that enables high-throughput, fault-tolerant stream processing of live data streams.

Drop spark dataframe from cache

I am using Spark 1.3.0 with the Python API. While transforming huge DataFrames, I cache many DFs for faster execution: df1.cache() …

apache-spark apache-spark-sql spark-streaming
The value of "spark.yarn.executor.memoryOverhead" setting?

The value of spark.yarn.executor.memoryOverhead in a Spark job with YARN should be allocated to App or just …

apache-spark apache-spark-sql spark-streaming apache-spark-mllib
How to write spark streaming DF to Kafka topic

I am using Spark Streaming to process data between two Kafka queues but I cannot seem to find a …

scala apache-spark apache-kafka spark-streaming spark-streaming-kafka
build.sbt: how to add spark dependencies

Hello, I am trying to download spark-core, spark-streaming, twitter4j, and spark-streaming-twitter in the build.sbt file below: name := "hello" …

scala apache-spark sbt spark-streaming
Spark using Python: How to resolve "Stage x contains a task of very large size (xxx KB). The maximum recommended task size is 100 KB"

I've just created a Python list of range(1,100000). Using SparkContext, I did the following steps: a = sc.parallelize([i for i in …

apache-spark spark-streaming
IBM MQ versus Apache Kafka

I am designing a new big data architecture where my client has an IBM MQ broker. We used to work …

ibm-mq apache-kafka apache-storm spark-streaming
Condition in map function

Is there anything in Scala like condition ? first_expression : second_expression; that I can use within a map function? …

scala apache-spark spark-streaming map-function
Difference in Used, Committed and Max Heap Memory

I am monitoring a Spark executor JVM for an OutOfMemoryException. I used JConsole to connect to the executor JVM. Following is …

java apache-spark memory-management jvm spark-streaming
Spark Driver Memory and Executor Memory

I am a beginner with Spark and I am running my application to read 14 KB of data from a text file, do some …

java apache-spark spark-streaming spark-submit
java.lang.NoClassDefFoundError: org/apache/spark/streaming/twitter/TwitterUtils$ while running TwitterPopularTags

I am a beginner in Spark Streaming and Scala. For a project requirement I was trying to run the TwitterPopularTags example …

scala maven apache-spark noclassdeffounderror spark-streaming