Apache Spark SQL is a tool for "SQL and structured data processing" on Spark, a fast and general-purpose cluster computing system.
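For orientation, a minimal Scala sketch of what the tag covers: registering a DataFrame as a temporary view and querying it with plain SQL. The data and view name here are illustrative, not taken from any particular question.

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("spark-sql-example")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Small in-memory DataFrame (illustrative data)
    val people = Seq(("Alice", 34), ("Bob", 45)).toDF("name", "age")

    // Register as a temporary view, then query it with SQL
    people.createOrReplaceTempView("people")
    spark.sql("SELECT name FROM people WHERE age > 40").show()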
I am using pyspark 2.0 to create a DataFrame object by reading a csv using: data = spark.read.csv('data.csv', …
apache-spark spark-dataframe apache-spark-2.0
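The read call is truncated, so the asker's exact options are unknown. As a sketch, here is the CSV reader built into Spark 2.0 (shown in Scala; the pyspark call in the question mirrors it), with header and inferSchema as assumed options rather than the asker's actual arguments:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().master("local[*]").getOrCreate()

    val data = spark.read
      .option("header", "true")       // treat the first line as column names (assumption)
      .option("inferSchema", "true")  // sample the file to guess column types (assumption)
      .csv("data.csv")

    data.printSchema()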
I am trying to create a DataFrame using an RDD. First I am creating an RDD using the code below - val …
scala apache-spark spark-dataframe apache-spark-dataset
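The val … is cut off, so the asker's RDD is unknown. A minimal sketch of the two standard routes from an RDD to a DataFrame; the Person case class and sample rows are hypothetical stand-ins:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    // Hypothetical record type standing in for the asker's data
    case class Person(name: String, age: Int)

    val rdd = spark.sparkContext.parallelize(Seq(Person("Ann", 30), Person("Ben", 25)))

    // Route 1: implicit conversion (needs spark.implicits._)
    val df1 = rdd.toDF()

    // Route 2: explicit conversion
    val df2 = spark.createDataFrame(rdd)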
I'm working on a Spark MLlib algorithm. The dataset I have is in this form: Company":"XXXX","CurrentTitle":"XYZ","Edu_…
apache-spark apache-spark-sql spark-dataframe apache-spark-mllib
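The records look like JSON objects whose opening brace was lost to truncation. Assuming the file really is JSON Lines, a sketch of loading it into a DataFrame before handing it to MLlib; the path is a placeholder:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().master("local[*]").getOrCreate()

    // Assumes one JSON object per line, e.g.
    // {"Company":"XXXX","CurrentTitle":"XYZ", ...}
    val df = spark.read.json("records.json")

    df.select("Company", "CurrentTitle").show()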
How do you replace single quotes with double quotes in Scala? I have a data file that has some records …
scala dataframe spark-dataframe double-quotes single-quotes
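For the string manipulation itself, plain String.replace is enough in Scala. A sketch with a hypothetical record and, commented out, the same fix mapped over a file read as an RDD of lines:

    // Replace every single quote with a double quote in one record
    val record = "{'name':'Thomas','age':30}"
    val fixed  = record.replace('\'', '"')
    // fixed is now {"name":"Thomas","age":30}

    // Applied to a whole file (sc is an existing SparkContext; the path is a placeholder):
    // val cleaned = sc.textFile("data.txt").map(_.replace('\'', '"'))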
Given the following DataSet values as inputData:

    column0  column1  column2  column3
    A        88       text     99
    Z        12       test     200
    T        120      foo      12

In Spark, what …
scala apache-spark spark-dataframe apache-spark-dataset
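The question is cut off after "In Spark, what", so the actual task is unknown. For reference, a sketch that just materializes the sample rows as a typed Dataset; the case class is invented to match the four columns shown:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    // Hypothetical record type for the four columns shown above
    case class Record(column0: String, column1: Long, column2: String, column3: Long)

    val inputData = Seq(
      Record("A", 88, "text", 99),
      Record("Z", 12, "test", 200),
      Record("T", 120, "foo", 12)
    ).toDS()

    inputData.show()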
Community! Please help me understand how to get a better compression ratio with Spark. Let me describe the case: I have a dataset, …
apache-spark apache-spark-sql spark-dataframe parquet snappy
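Going by the parquet and snappy tags, the codec is set through spark.sql.parquet.compression.codec; snappy is the Spark 2.x default and favors speed, while gzip usually yields a better ratio at the cost of slower writes. A sketch, with a generated stand-in for the dataset:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().master("local[*]").getOrCreate()

    // Switch Parquet output from the snappy default to gzip
    spark.conf.set("spark.sql.parquet.compression.codec", "gzip")

    // Hypothetical data; sorting on a low-cardinality column before writing
    // can also help Parquet's run-length and dictionary encoding
    val df = spark.range(1000000).toDF("id")
    df.sort("id").write.parquet("/tmp/out_gzip")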
I am saving my Spark data frame output as a CSV file in Scala with partitions. This is how I do …
scala apache-spark amazon-s3 spark-dataframe multipleoutputs
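The write call is truncated; as a sketch, Spark 2.0's CSV writer combined with partitionBy, targeting S3 through the s3a connector. The column names and bucket path are placeholders:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    // Hypothetical frame; "year" stands in for whatever column drives the layout
    val df = Seq(("a", 2016), ("b", 2017)).toDF("value", "year")

    df.write
      .partitionBy("year")             // one subdirectory per distinct year value
      .option("header", "true")
      .csv("s3a://my-bucket/output/")  // placeholder bucket and path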
I have a Spark DataFrame as shown below:

    # Create DataFrame
    df <- data.frame(name = c("Thomas", "William", "Bill", "…

pyspark spark-dataframe sparkr
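The snippet is SparkR, building a local R data.frame before (presumably) converting it, and the vector of names is cut off. For comparison only, a Scala sketch creating an equivalent single-column Spark DataFrame from the names that are visible:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    // Only the three visible names are from the question; the rest were truncated
    val df = Seq("Thomas", "William", "Bill").toDF("name")
    df.show()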
I thought that with the integration of Project Tungsten, Spark would automatically use off-heap memory. What for are spark.…
apache-spark apache-spark-sql spark-dataframe apache-spark-2.0 off-heap
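The setting names are truncated; presumably they are spark.memory.offHeap.enabled and spark.memory.offHeap.size, which default to off, so Tungsten's off-heap execution memory is opt-in rather than automatic. A sketch, assuming that reading of the question:

    import org.apache.spark.sql.SparkSession

    // Off-heap execution memory must be enabled explicitly and given
    // a fixed size in bytes; it is disabled by default
    val spark = SparkSession.builder()
      .master("local[*]")
      .config("spark.memory.offHeap.enabled", "true")
      .config("spark.memory.offHeap.size", 2L * 1024 * 1024 * 1024)  // 2 GiB
      .getOrCreate()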
I created a DataFrame using sqlContext and I have a problem with the datetime format, as it is identified as …
datetime apache-spark pyspark spark-dataframe python-datetime
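The datetime column is presumably arriving as a plain string. In Spark 2.0 (before to_timestamp was added in 2.2), the usual fix is unix_timestamp with an explicit pattern plus a cast; the question is pyspark, but the same functions exist there. A Scala sketch with an assumed format:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.unix_timestamp

    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    // Hypothetical frame whose datetime column arrived as a string
    val df = Seq("2017-03-01 12:30:00").toDF("dt")

    // Parse with an explicit pattern, then cast to a real timestamp type
    val parsed = df.withColumn(
      "dt_ts",
      unix_timestamp($"dt", "yyyy-MM-dd HH:mm:ss").cast("timestamp")
    )
    parsed.printSchema()  // dt_ts is now a timestamp column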