Popular "apache-spark" questions | Page 6

Looking at the new Spark DataFrame API, it is unclear whether it is possible to modify DataFrame columns. How would …

python apache-spark pyspark apache-spark-sql spark-dataframe

I'm using Spark 1.4.0-rc2 so I can use Python 3 with Spark. If I add export PYSPARK_PYTHON=python3 to my .…

apache-spark pyspark

I want to select a column that equals a certain value. I am doing this in Scala and having …

scala apache-spark dataframe apache-spark-sql

I'm just wondering what the difference is between an RDD and a DataFrame (in Spark 2.0.0, DataFrame is a mere type alias for …

dataframe apache-spark apache-spark-sql rdd apache-spark-dataset

What's the difference between an RDD's map and mapPartitions methods? And does flatMap behave like map or like mapPartitions? Thanks. (…

performance scala apache-spark rdd

I've seen various people suggesting that DataFrame.explode is a useful way to do this, but it results in more …

apache-spark pyspark apache-spark-sql spark-dataframe pyspark-sql

True ... it has been discussed quite a lot. However, there is a lot of ambiguity and some of the answers …

java scala apache-spark jar spark-submit

I installed Spark using the AWS EC2 guide and I can launch the program fine using the bin/pyspark script …

python scala apache-spark hadoop pyspark

I would like to modify the cell values of a DataFrame column (Age) where it is currently blank and I …

python apache-spark dataframe pyspark apache-spark-sql

I'm not able to run a simple Spark job in Scala IDE (Maven Spark project) installed on Windows 7 Spark core …

eclipse scala apache-spark
