Top "Apache-spark-2.0" questions

Use for questions specific to Apache Spark 2.0. For general questions related to Apache Spark, use the tag [apache-spark].

What are the various join types in Spark?

I looked at the docs, and they say the following join types are supported: Type of join to perform. Default …

scala apache-spark apache-spark-sql spark-dataframe apache-spark-2.0
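
A minimal sketch of the join-type strings Spark 2.0's Dataset.join accepts (underscores are ignored when the string is parsed, so "left_outer" and "leftouter" are equivalent); the sample frames are made up:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("join-types").master("local[*]").getOrCreate()
    import spark.implicits._

    val left  = Seq((1, "a"), (2, "b"), (3, "c")).toDF("id", "l")
    val right = Seq((1, "x"), (3, "y"), (4, "z")).toDF("id", "r")

    // Join-type strings accepted in Spark 2.0; aliases such as "left"
    // and "full" map to the same plans as "leftouter" and "outer".
    Seq("inner", "outer", "left_outer", "right_outer", "leftsemi", "leftanti")
      .foreach { jt =>
        println(s"=== $jt ===")
        left.join(right, Seq("id"), jt).show()
      }
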
Reading CSV files with quoted fields containing embedded commas

I am reading a CSV file in PySpark as follows: df_raw=spark.read.option("header","true").csv(csv_path) …

csv apache-spark pyspark apache-spark-sql apache-spark-2.0
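
Spark 2.0's built-in CSV reader already treats commas inside double-quoted fields as data; the usual extra tweak is the escape option for quoted fields that contain literal quote characters. A sketch (Scala here for consistency with the other examples; the PySpark reader takes the same options, and the path is a placeholder):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("csv-quotes").master("local[*]").getOrCreate()

    val df = spark.read
      .option("header", "true")
      .option("quote", "\"")    // default: fields wrapped in " may contain commas
      .option("escape", "\"")   // treat "" inside a quoted field as a literal quote
      .csv("/path/to/file.csv")

    df.show(truncate = false)
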
Spark Parquet partitioning: large number of files

I am trying to leverage Spark partitioning. I was trying to do something like data.write.partitionBy("key").parquet("/location") …

apache-spark spark-dataframe rdd apache-spark-2.0 bigdata
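
The usual cause is that every shuffle task writes one file for each key it happens to hold; repartitioning by the partition column first routes each key to a single task, yielding one file per output directory. A sketch with made-up data and a placeholder path:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("parquet-partitioning").master("local[*]").getOrCreate()
    import spark.implicits._

    val data = Seq((1, "a"), (1, "b"), (2, "c")).toDF("key", "value")

    data
      .repartition($"key")   // collapse each key into one shuffle partition
      .write
      .partitionBy("key")    // layout: /tmp/location/key=.../part-*.parquet
      .parquet("/tmp/location")
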
How to create a SparkSession from an existing SparkContext

I have a Spark application which uses the new Spark 2.0 API with SparkSession. I am building this application on top of …

scala apache-spark apache-spark-2.0
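
SparkSession has no public constructor taking a SparkContext in 2.0, but the builder's getOrCreate() reuses whatever context is already running. A minimal sketch:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SparkSession

    // Pre-existing context, e.g. created by legacy code
    val sc = new SparkContext(new SparkConf().setAppName("legacy").setMaster("local[*]"))

    // getOrCreate() picks up the active SparkContext instead of starting a new one
    val spark = SparkSession.builder().config(sc.getConf).getOrCreate()

    assert(spark.sparkContext eq sc)   // same underlying context
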
Dynamically bind a variable/parameter in Spark SQL?

How do I bind a variable in Apache Spark SQL? For example: val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc) …

scala apache-spark apache-spark-sql apache-spark-2.0
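
Spark SQL (as of 2.0) has no true bind-variable or prepared-statement mechanism; the common workarounds are string interpolation (plain text substitution, so only for trusted values) or keeping the parameter in the DataFrame API. A sketch with a made-up temp view:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("bind-vars").master("local[*]").getOrCreate()
    import spark.implicits._

    Seq((1, "alice"), (2, "bob")).toDF("id", "name").createOrReplaceTempView("users")

    val minId = 2
    spark.sql(s"SELECT * FROM users WHERE id >= $minId").show()   // interpolation, not binding

    spark.table("users").filter($"id" >= minId).show()            // parameter stays in Scala
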
Timeout exception in Apache Spark during program execution

I am running a Bash script on a Mac. This script calls a Spark method written in Scala for a …

scala apache-spark spark-graphx apache-spark-2.0
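
Future-timeout errors during long stages are often mitigated by raising spark.network.timeout while keeping spark.executor.heartbeatInterval well below it; the values below are illustrative, not tuned:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("timeout-tuning")
      .master("local[*]")
      .config("spark.network.timeout", "600s")           // default 120s
      .config("spark.executor.heartbeatInterval", "60s") // default 10s
      .getOrCreate()
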
How to traverse/iterate a Dataset in Spark Java?

I am trying to traverse a Dataset to do some string similarity calculations like Jaro-Winkler or cosine similarity. I …

java apache-spark iterator apache-spark-2.0 apache-spark-dataset
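
Two common traversal patterns, sketched in Scala for consistency with the other examples (the Java Dataset API is analogous, taking a MapFunction plus an explicit Encoder); the toy similarity below stands in for a real Jaro-Winkler implementation:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("iterate-ds").master("local[*]").getOrCreate()
    import spark.implicits._

    val ds = Seq(("kitten", "sitting"), ("spark", "shark")).toDS()

    // Distributed traversal: map runs on the executors
    val scored = ds.map { case (a, b) => (a, b, if (a == b) 1.0 else 0.0) }
    scored.show()

    // Driver-side traversal without collecting the whole Dataset at once
    val it = ds.toLocalIterator()
    while (it.hasNext) println(it.next())
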
Apache Spark vs Apache Spark 2

What improvements does Apache Spark 2 bring compared to Apache Spark, from an architecture perspective, from an application point of view, or …

apache-spark apache-spark-2.0
Spark join raises "Detected cartesian product for INNER join"

I have a DataFrame and I want to add, for each row, new_col=max(some_column0) grouped by some …

pyspark spark-dataframe apache-spark-2.0
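
Attaching a per-group max via a self-join is what trips the cartesian-product check; a window function does the same thing with no join at all. A sketch (Scala for consistency with the other examples; the column names are made up):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.expressions.Window
    import org.apache.spark.sql.functions.max

    val spark = SparkSession.builder().appName("group-max").master("local[*]").getOrCreate()
    import spark.implicits._

    val df = Seq(("a", 1), ("a", 3), ("b", 2)).toDF("grp", "some_column0")

    // Per-group max on every row, no self-join involved
    df.withColumn("new_col", max($"some_column0").over(Window.partitionBy($"grp"))).show()
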
How to use Dataset to groupBy

I have a request to use an RDD to do so: val test = Seq(("New York", "Jack"), ("Los Angeles", "Tom"), ("Chicago", "…

apache-spark dataset apache-spark-2.0
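
The typed counterpart of an RDD groupBy is groupByKey plus mapGroups. A sketch echoing the excerpt's data (the third pair is truncated in the question, so a placeholder name is used):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("ds-groupby").master("local[*]").getOrCreate()
    import spark.implicits._

    val test = Seq(("New York", "Jack"), ("Los Angeles", "Tom"), ("Chicago", "???"))
    val ds = test.toDS()

    // Group by city, then fold each group's names into one string
    ds.groupByKey { case (city, _) => city }
      .mapGroups { (city, rows) => (city, rows.map(_._2).mkString(", ")) }
      .show(truncate = false)
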