Apache Spark SQL is a tool for "SQL and structured data processing" on Spark, a fast and general-purpose cluster computing system.
I'm getting an error while trying to run the following code: import org.apache.spark.sql.Dataset; import org.apache.…
java apache-spark hive apache-spark-sql

I would like to include null values in an Apache Spark join. Spark doesn't include rows with null by default. …
sql scala apache-spark join apache-spark-sql

I have a Spark data frame where one column is an array of integers. The column is nullable because it …
apache-spark dataframe apache-spark-sql apache-spark-1.5

I have a Spark 2.0 dataframe example with the following structure:
id, hour, count
id1, 0, 12
id1, 1, 55
..
id1, 23, 44
id2, 0, 12
id2, 1, 89
..
id2, 23, 34
etc. …
scala apache-spark apache-spark-sql spark-streaming spark-dataframe

I have read an Avro file into a Spark RDD and need to convert that into a SQL DataFrame. How do …
scala apache-spark apache-spark-sql apache-zeppelin

Consider the code given here, https://spark.apache.org/docs/1.2.0/ml-guide.html import org.apache.spark.ml.classification.LogisticRegression val …
scala apache-spark pyspark apache-spark-sql apache-spark-ml

I have a UDF which returns a list of strings. This should not be too hard. I pass in the …
python apache-spark pyspark apache-spark-sql user-defined-functions

Context: I have a DataFrame with 2 columns: word and vector, where the column type of "vector" is VectorUDT. An example: …
python apache-spark pyspark apache-spark-sql apache-spark-ml

I have an embarrassingly parallel task for which I use Spark to distribute the computations. These computations are in Python, …
python apache-spark hbase pyspark apache-spark-sql

Similar question as here, but I don't have enough points to comment there. According to the latest Spark documentation a UDF …
java apache-spark apache-spark-sql user-defined-functions