Top "Apache-spark-dataset" questions

A Spark Dataset is a strongly typed collection of objects mapped to a relational schema.

Difference between DataFrame, Dataset, and RDD in Spark

I'm just wondering what the difference is between an RDD and a DataFrame (in Spark 2.0.0, DataFrame is a mere type alias for …

dataframe apache-spark apache-spark-sql rdd apache-spark-dataset
How to store custom objects in Dataset?

According to Introducing Spark Datasets: As we look forward to Spark 2.0, we plan some exciting improvements to Datasets, specifically: ... Custom …

scala apache-spark apache-spark-dataset apache-spark-encoders
Why is "Unable to find encoder for type stored in a Dataset" when creating a dataset of custom case class?

Spark 2.0 (final) with Scala 2.11.8. The following super simple code yields the compilation error Error:(17, 45) Unable to find encoder for type …

scala apache-spark apache-spark-dataset apache-spark-encoders
How to change case of whole column to lowercase?

I want to change the case of a whole column to lowercase in a Spark Dataset. Desired input: +------+--------------------+ |ItemID| Category name| +…

java apache-spark apache-spark-sql apache-spark-dataset
How to convert a Spark Dataset of Rows into strings?

I have written the code to access the Hive table using SparkSQL. Here is the code: SparkSession spark = SparkSession .builder() .…

java string apache-spark apache-spark-sql apache-spark-dataset
printSchema() in Apache Spark

Dataset<Tweet> ds = sc.read().json("/path").as(Encoders.bean(Tweet.class)); Tweet class :- long id string …

apache-spark spark-dataframe apache-spark-dataset
Encoder error while trying to map dataframe row to updated row

When I'm trying to do the same thing in my code, as mentioned below: dataframe.map(row => { val …

scala apache-spark apache-spark-sql apache-spark-dataset apache-spark-encoders
Difference between DataSet API and DataFrame API

I'm just wondering what the difference is between an RDD and a DataFrame (in Spark 2.0.0, DataFrame is a mere type alias for …

apache-spark apache-spark-sql rdd apache-spark-dataset
Spark Dataset API - join

I am trying to use the Spark Dataset API but I am having some issues doing a simple join. Let's …

scala apache-spark apache-spark-sql apache-spark-dataset
Spark createOrReplaceTempView vs createGlobalTempView

Spark Dataset 2.0 provides two functions createOrReplaceTempView and createGlobalTempView. I am not able to understand the basic difference between both functions. …

apache-spark apache-spark-dataset