Popular "rdd" questions | Page 2

I know how to find a file's size in Scala. But how do I find the size of an RDD/DataFrame in Spark? …

scala apache-spark rdd

How can I find the median of an RDD of integers using a distributed method, IPython, and Spark? The RDD is …

python apache-spark median rdd pyspark

When a resilient distributed dataset (RDD) is created from a text file or collection (or from another RDD), do we …

scala apache-spark rdd

In my Pig code I do this: all_combined = Union relation1, relation2, relation3, relation4, relation5, relation6. I want to do …

python apache-spark pyspark rdd

The code below reads from HBase, converts the result to a JSON structure, and then converts it to a SchemaRDD. But …

hbase apache-spark rdd

I have an RDD and I want to convert it to a pandas DataFrame. I know that to convert an RDD …

python pandas ipython pyspark rdd

I need to join two ordinary RDDs on one or more columns. Logically this operation is equivalent to the database join …

scala join apache-spark rdd apache-spark-sql

I am dealing with transforming SQL code to PySpark code and came across some SQL statements. I don't know how …

apache-spark pyspark spark-dataframe rdd pyspark-sql

I know the method rdd.first() which gives me the first element in an RDD. Also there is the method …

java apache-spark rdd

I'm trying to load an SVM file and convert it to a DataFrame so I can use the ML module (…

python apache-spark pyspark apache-spark-sql rdd
