Top "Databricks" questions

For questions about the Databricks Unified Analytics Platform

Remove Files from Directory after uploading in Databricks using dbutils

A very clever person from Stack Overflow assisted me in copying files to a directory from Databricks here: copyfiles. I am …

python databricks azure-databricks
Databricks dbutils.fs.ls shows files. However, reading them throws an IO error

I am running a Spark cluster, and when I execute the command below in a Databricks notebook, it gives me the …

pyspark databricks
How to drop a column from a Databricks Delta table?

I have recently started discovering Databricks and faced a situation where I need to drop a certain column of a …

sql apache-spark apache-spark-sql databricks delta-lake
Exporting spark dataframe to .csv with header and specific filename

I am trying to export data from a spark dataframe to .csv file: df.coalesce(1)\ .write\ .format("com.databricks.spark.…

python apache-spark pyspark export-to-csv databricks
Databricks - failing to write from a DataFrame to a Delta location

I wanted to change a column name of a Databricks Delta table. So I did the following: // Read old table …

scala apache-spark databricks delta-lake
Databricks display() function equivalent or alternative to Jupyter

I'm in the process of migrating current Databricks Spark notebooks to Jupyter notebooks. Databricks provides convenient and beautiful display(data_…

apache-spark jupyter-notebook databricks
How can I convert a pyspark.sql.dataframe.DataFrame back to a SQL table in a Databricks notebook?

I created a dataframe of type pyspark.sql.dataframe.DataFrame by executing the following line: dataframe = sqlContext.sql("select * from …

python sql apache-spark pyspark databricks
Ways to Plot Spark Dataframe without Converting it to Pandas

Is there any way to plot information from a Spark dataframe without converting the dataframe to pandas? I did some online research …

python pandas pyspark databricks
Azure Databricks - Can not create the managed table. The associated location already exists

I have the following problem in Azure Databricks. Sometimes when I try to save a DataFrame as a managed table: …

apache-spark hive azure-data-lake databricks azure-databricks
Saving a dataframe result value to a string variable?

I created a dataframe in Spark. When I find the max date, I want to save it to a variable. Just …

python dataframe spark-dataframe pyspark-sql databricks