Top "Apache-spark-1.4" questions

Use for questions specific to Apache Spark 1.4. For general questions related to Apache Spark, use the tag [apache-spark].

DataFrame join optimization - Broadcast Hash Join

I am trying to effectively join two DataFrames, one of which is large and the second is a bit smaller. …

apache-spark dataframe apache-spark-sql apache-spark-1.4

How to optimize shuffle spill in Apache Spark application

I am running a Spark Streaming application with 2 workers. The application has a join and a union operation. All the batches …

apache-spark spark-streaming apache-spark-1.4

How to start a Spark Shell using pyspark in Windows?

I am a beginner in Spark and am trying to follow instructions from here on how to initialize Spark shell from …

pyspark apache-spark-1.4

Cannot start spark-shell

I am using Spark 1.4.1. I can use spark-submit without problem. But when I ran ~/spark/bin/spark-shell I got the …

apache-spark apache-spark-1.4

Find size of data stored in rdd from a text file in apache spark

I am new to Apache Spark (version 1.4.1). I wrote a small program to read a text file and stored its …

scala apache-spark apache-spark-1.4