Top "Apache-spark-2.0" questions

Use for questions specific to Apache Spark 2.0. For general questions related to Apache Spark, use the tag [apache-spark].

Reading Json file using Apache Spark

I am trying to read a JSON file using Spark v2.0.0. With simple data, the code works really well. In …

java json hadoop apache-spark apache-spark-2.0
Parsing json in spark

I was using a JSON Scala library to parse JSON from a local drive in a Spark job: val requestJson=JSON.…

scala apache-spark apache-spark-sql apache-spark-2.0
How to cast a WrappedArray[WrappedArray[Float]] to Array[Array[Float]] in spark (scala)

I'm using Spark 2.0. I have a column in my DataFrame containing a WrappedArray of WrappedArrays of Float. An example of …

arrays scala casting spark-dataframe apache-spark-2.0
adding two columns from a data frame in scala

I have two columns, age and salary, stored in DF. I just want to write Scala code to add …

scala apache-spark apache-spark-sql apache-spark-2.0
How to create encoder for custom Java objects?

I am using the following class to create a bean from Spark Encoders: Class OuterClass implements Serializable { int id; ArrayList<InnerClass…

java apache-spark apache-spark-2.0
pyspark error: 'DataFrame' object has no attribute 'map'

I am using PySpark 2.0 to create a DataFrame object by reading a CSV file using: data = spark.read.csv('data.csv', …

apache-spark spark-dataframe apache-spark-2.0
How to map struct in DataFrame to case class?

At some point in my application, I have a DataFrame with a Struct field created from a case class. Now …

scala apache-spark dataframe apache-spark-sql apache-spark-2.0
Pass system property to spark-submit and read file from classpath or custom path

I have recently found a way to use logback instead of log4j in Apache Spark (both for local use …

java scala apache-spark apache-spark-2.0 spark-submit
Spark 2.0 memory fraction

I am working with Spark 2.0. The job starts by sorting the input data and storing its output on HDFS. I …

memory apache-spark out-of-memory distributed-computing apache-spark-2.0
Reading Avro messages from Kafka with Spark 2.0.2 (structured streaming)

I have a Spark 2.0 application that reads messages from Kafka using Spark Streaming (with spark-streaming-kafka-0-10_2.11). Structured Streaming looks really …

scala apache-kafka spark-streaming avro apache-spark-2.0