Top "Spark-csv" questions

A library for handling CSV files in Apache Spark.

Decimal data type not storing the values correctly in both Spark and Hive

I am having a problem storing values with the decimal data type and am not sure if it is a bug or …

apache-spark hive apache-spark-sql spark-csv
Add UUID to spark dataset

I am trying to add a UUID column to my dataset. getDataset(Transaction.class)).withColumn("uniqueId", functions.lit(UUID.randomUUID().…

apache-spark apache-spark-dataset spark-csv
Can I read a CSV represented as a string into Apache Spark using spark-csv

I know how to read a CSV file into Spark using spark-csv (https://github.com/databricks/spark-csv), but I already …

apache-spark apache-spark-sql spark-csv
How to save CSV with all fields quoted?

The code below does not add the double quotes, which are the default. I also tried adding # and single quote …

scala apache-spark spark-csv
Programmatically generate the schema AND the data for a dataframe in Apache Spark

I would like to dynamically generate a dataframe containing a header record for a report, so creating a dataframe from …

apache-spark dataframe spark-dataframe rdd spark-csv
Spark DataFrame handling empty String in OneHotEncoder

I am importing a CSV file (using spark-csv) into a DataFrame which has empty String values. When I apply the OneHotEncoder, …

scala apache-spark apache-spark-mllib apache-spark-ml spark-csv
Spark CSV package not able to handle \n within fields

I have a CSV file which I am trying to load using the Spark CSV package and it does not load …

scala apache-spark apache-spark-sql spark-csv apache-spark-1.6
What is the difference between sqlContext.read.load and sqlContext.read.text?

I am only trying to read a text file into a PySpark RDD, and I am noticing huge differences between sqlContext.…

apache-spark pyspark apache-spark-sql spark-csv