Apache Spark SQL is a tool for "SQL and structured data processing" on Spark, a fast and general-purpose cluster computing system.
I would like to dynamically generate a dataframe containing a header record for a report, so creating a dataframe from …
apache-spark dataframe spark-dataframe rdd spark-csv
I have a table with an array type column named writer which has values like array[value1, value2], array[…
apache-spark apache-spark-sql spark-dataframe hiveql apache-spark-dataset
I have a SparkR DataFrame as shown below: #Create R data.frame custId <- c(rep(1001, 5), rep(1002, 3), 1003) date <…
apache-spark pyspark spark-dataframe sparkr
I'm processing events using DataFrames converted from a stream of JSON events which eventually gets written out as Parquet …
apache-spark apache-spark-sql spark-streaming spark-dataframe parquet
I want to write a DataFrame in Avro format using a provided Avro schema rather than Spark's auto-generated schema. How …
apache-spark spark-dataframe spark-avro
Using Spark 1.4.0, I am trying to insert data from a Spark DataFrame into a MemSQL database (which should be exactly …
mysql apache-spark spark-dataframe singlestore
I created a DataFrame in Spark; after finding the max date, I want to save it to a variable. Just …
python dataframe spark-dataframe pyspark-sql databricks
I'm using SparkSQL in a Java application to do some processing on CSV files using Databricks for parsing. The data …
java apache-spark apache-spark-sql spark-dataframe databricks
I have a data file with three columns, and I want to normalize the last column to apply ALS with …
scala apache-spark spark-dataframe apache-spark-ml normalize
I have the following DataFrame:
|-----id-------|----value------|-----desc------|
| 1            | v1            | d1            |
| 1            | v2            | d2            |
| 2            | v21           | d21           |
| 2            | v22           | d22           |
|--------------|---------------|---------------|
I want …
scala apache-spark group-concat rdd spark-dataframe