Popular "apache-spark-sql" questions | Page 14

I have a very large dataset that is loaded in Hive. It consists of about 1.9 million rows and 1450 columns. I …

python apache-spark dataframe pyspark apache-spark-sql

What are the differences between Apache Spark SQLContext and HiveContext? Some sources say that since the HiveContext is a superset …

apache-spark hive apache-spark-sql

I have a DataFrame generated as follows: df.groupBy($"Hour", $"Category").agg(sum($"value").alias("TotalValue")).sort($"Hour".asc, $"TotalValue".…

scala apache-spark apache-spark-sql spark-dataframe parquet

As a simplified example, I have a dataframe "df" with columns "col1,col2" and I want to compute a row-wise …

python apache-spark pyspark apache-spark-sql

I am trying to use the Spark Dataset API but I am having some issues doing a simple join. Let's …

scala apache-spark apache-spark-sql apache-spark-dataset

I'm wondering how I can achieve the following in Spark (PySpark). Initial DataFrame:
+--+---+
|id|num|
+--+---+
|4 |9.0|
+--+…

python apache-spark dataframe pyspark apache-spark-sql

I want to parse the date columns in a DataFrame, and for each date column, the resolution for the date …

scala apache-spark apache-spark-sql user-defined-functions

I am new to Spark SQL. We are migrating data from SQL Server to Databricks. I am using Spark SQL. …

apache-spark-sql datediff databricks

I am trying to test how to write data in HDFS 2.7 using Spark 2.1. My data is a simple sequence of …

scala apache-spark apache-spark-sql parquet

I am trying to run random forest classification using the Spark ML API, but I am having issues with creating …

scala apache-spark apache-spark-sql apache-spark-mllib
