Top "apache-spark-sql" questions

Apache Spark SQL is a tool for "SQL and structured data processing" on Spark, a fast and general-purpose cluster computing system.

Applying function to Spark Dataframe Column

Coming from R, I am used to easily doing operations on columns. Is there any easy way to take this …

scala apache-spark dataframe apache-spark-sql user-defined-functions
Spark SQL - Difference between df.repartition and DataFrameWriter partitionBy?

What is the difference between the DataFrame repartition() and DataFrameWriter partitionBy() methods? I understand both are used to "partition data based …

apache-spark-sql data-partitioning
Pyspark: Split multiple array columns into rows

I have a dataframe which has one row, and several columns. Some of the columns are single values, and others …

python apache-spark dataframe pyspark apache-spark-sql
SparkSQL vs Hive on Spark - Difference and pros and cons?

The SparkSQL CLI internally uses HiveQL, and in the case of Hive on Spark (HIVE-7292), Hive uses Spark as its backend engine. Can somebody …

apache-spark hadoop hive apache-spark-sql
SparkSQL: How to deal with null values in user defined function?

Given Table 1 with one column "x" of type String. I want to create Table 2 with a column "y" that is …

scala apache-spark apache-spark-sql user-defined-functions nullable
Median / quantiles within PySpark groupBy

I would like to calculate group quantiles on a Spark dataframe (using PySpark). Either an approximate or exact result would …

apache-spark pyspark apache-spark-sql pyspark-sql
Dynamically bind variable/parameter in Spark SQL?

How do you bind a variable in Apache Spark SQL? For example: val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc) …

scala apache-spark apache-spark-sql apache-spark-2.0
inferSchema in spark-csv package

When a CSV file is read as a dataframe in Spark, all the columns are read as strings. Is there any way to …

scala apache-spark apache-spark-sql spark-csv
Applying UDFs on GroupedData in PySpark (with functioning python example)

I have this python code that runs locally in a pandas dataframe: df_result = pd.DataFrame(df .groupby('A') .apply(…

python apache-spark pyspark apache-spark-sql user-defined-functions
How to add a new Struct column to a DataFrame

I'm currently trying to extract a database from MongoDB and use Spark to ingest into ElasticSearch with geo_points. The …

scala elasticsearch apache-spark etl apache-spark-sql