Top "Apache-spark-sql" questions

Apache Spark SQL is a module for "SQL and structured data processing" on Spark, a fast, general-purpose cluster-computing system.

How do I convert an array (i.e. list) column to Vector

Short version of the question! Consider the following snippet (assuming spark is already set to some SparkSession): from pyspark.sql …

python apache-spark pyspark apache-spark-sql apache-spark-ml
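A minimal PySpark sketch of the usual answer (column names here are illustrative): wrap Vectors.dense in a UDF with a VectorUDT return type. On Spark 3.1+, the built-in pyspark.ml.functions.array_to_vector avoids the Python UDF entirely.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.ml.linalg import Vectors, VectorUDT

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1, [0.1, 0.2, 0.3])], ["id", "features_arr"])

# Wrap Vectors.dense in a UDF so the array<double> column becomes an ML Vector.
to_vector = udf(lambda xs: Vectors.dense(xs), VectorUDT())

df.withColumn("features", to_vector("features_arr")).printSchema()
```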
Apache Spark -- Assign the result of UDF to multiple dataframe columns

I'm using pyspark, loading a large csv file into a dataframe with spark-csv, and as a pre-processing step I need …

python apache-spark pyspark apache-spark-sql user-defined-functions
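The usual pattern, sketched with made-up column names: have the UDF return a single struct, then unpack its fields into separate columns in one select.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf, col
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("2017-01-01 foo",)], ["raw"])

# The UDF returns one struct; selecting its fields yields multiple columns.
schema = StructType([
    StructField("token", StringType(), False),
    StructField("length", IntegerType(), False),
])
parse = udf(lambda s: (s.split(" ")[1], len(s)), schema)

(df.withColumn("parsed", parse(col("raw")))
   .select("raw", "parsed.token", "parsed.length")
   .show())
```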
Spark functions vs UDF performance?

Spark now offers predefined functions that can be used in dataframes, and it seems they are highly optimized. My original …

performance apache-spark pyspark apache-spark-sql user-defined-functions
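A sketch of the comparison the question is about: the built-in function stays in the JVM and is visible to the Catalyst optimizer, while a Python UDF serializes every row out to a Python worker and back.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf, concat_ws, col
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("a", "b")], ["x", "y"])

# Python UDF: each row round-trips between the JVM and a Python process.
concat_udf = udf(lambda a, b: a + "-" + b, StringType())
df.withColumn("z", concat_udf(col("x"), col("y")))

# Built-in function: no serialization, and Catalyst can optimize around it.
df.withColumn("z", concat_ws("-", col("x"), col("y")))
```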
Scala and Spark UDF function

I made a simple UDF to convert or extract some values from a time field in a temp table in Spark. …

scala apache-spark apache-spark-sql apache-zeppelin
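The question itself is in Scala; for consistency with the other sketches, here is the general pattern in PySpark (table and column names are made up): register the UDF with the session so SQL over the temp table can call it.

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import IntegerType

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("2017-01-01 10:30:00",)], ["ts"])
df.createOrReplaceTempView("logs")

# Registering by name makes the UDF callable from SQL statements.
spark.udf.register("hour_of",
                   lambda ts: int(ts.split(" ")[1].split(":")[0]),
                   IntegerType())

spark.sql("SELECT ts, hour_of(ts) AS hour FROM logs").show()
```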
Methods for writing Parquet files using Python?

I'm having trouble finding a library that allows Parquet files to be written using Python. Bonus points if I can …

python apache-spark apache-spark-sql parquet snappy
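A minimal sketch of one common answer, pyarrow (the file name is a placeholder; pandas.DataFrame.to_parquet wraps the same machinery):

```python
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

# Convert to an Arrow table and write it out; snappy compression covers the
# question's bonus requirement (and is pyarrow's default codec).
table = pa.Table.from_pandas(df)
pq.write_table(table, "example.parquet", compression="snappy")
```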
How to get keys and values from MapType column in SparkSQL DataFrame

I have data in a parquet file which has 2 fields: object_id: String and alpha: Map<>. It is …

scala apache-spark dataframe apache-spark-sql apache-spark-dataset
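A PySpark sketch of the two usual approaches, with made-up data matching the question's schema: explode for one row per map entry, or map_keys/map_values (Spark 2.3+) for array columns.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import explode, map_keys, map_values

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("obj1", {"a": 1, "b": 2})], ["object_id", "alpha"])

# explode yields one (key, value) row per map entry...
df.select("object_id", explode("alpha")).show()

# ...while map_keys/map_values extract them as array columns in place.
df.select("object_id", map_keys("alpha"), map_values("alpha")).show()
```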
"sparkContext was shut down" while running spark on a large dataset

When running a Spark job on a cluster past a certain data size (~2.5 GB), I am getting either "Job cancelled because SparkContext …

scala apache-spark yarn apache-spark-sql
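This error usually means the driver or executors were killed, often by YARN for exceeding memory limits once the data grows. A sketch of the kind of settings answers point at; the values are illustrative, not recommendations:

```python
from pyspark.sql import SparkSession

# Give executors headroom and spread the shuffle wider; on Spark < 2.3 the
# overhead setting is named spark.yarn.executor.memoryOverhead instead.
spark = (SparkSession.builder
         .appName("large-job")
         .config("spark.executor.memory", "6g")
         .config("spark.executor.memoryOverhead", "1g")
         .config("spark.sql.shuffle.partitions", "400")
         .getOrCreate())
```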
Explode (transpose?) multiple columns in Spark SQL table

I am using Spark SQL (I mention that it is in Spark in case that affects the SQL syntax - …

sql apache-spark apache-spark-sql hiveql
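In the SQL/HiveQL form the answer is typically LATERAL VIEW posexplode; in PySpark (kept here for consistency with the other sketches), arrays_zip (Spark 2.4+) pairs the arrays element-wise so a single explode transposes them in lockstep. Column names are made up.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import arrays_zip, explode, col

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1, ["a", "b"], [10, 20])],
                           ["id", "letters", "numbers"])

# Zip the arrays, explode once, then unpack the struct fields.
(df.withColumn("zipped", explode(arrays_zip("letters", "numbers")))
   .select("id", col("zipped.letters"), col("zipped.numbers"))
   .show())
```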
Queries with streaming sources must be executed with writeStream.start();

I'm trying to read messages from Kafka (version 10) in Spark and print them. import spark.implicits._ val …

scala apache-spark-sql spark-streaming
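The error is Structured Streaming refusing actions like show() or collect() on a streaming DataFrame: the stream must be attached to a sink and started. A PySpark sketch with placeholder broker and topic names:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

stream = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092")
          .option("subscribe", "my-topic")
          .load())

# Printing a stream requires a sink plus start(), not show().
query = (stream.selectExpr("CAST(value AS STRING)")
         .writeStream
         .format("console")
         .start())
query.awaitTermination()
```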