Turning Delta Tables into CSV for Download: Your Ultimate How-To Guide!

Converting Delta Tables to CSV

Kuharan Bhowmik
2 min readAug 25, 2023

--

Photo by Rod Long on Unsplash

Hey there, tech aficionados!

Today, we’re diving into some real nifty data sorcery using data bricks. Imagine you’ve got this data sitting in a Delta table, and you’re itching to transform it into something more universal, like CSV files.

Well, guess what? Databricks swoops in like a superhero with a user-friendly way to make this happen.

So, buckle up as we embark on a journey to wrangle data, sprinkle some tech magic, and end up with CSVs that everyone can understand. Ready? Let’s roll!

Databricks hooks you up with an easy way to deal with Delta tables, but hey, you can totally tweak this method for other setups too. Anyways, this is the game plan:

Save the Delta Table as CSV.

Detect the Part Files, rename and move them.

Chuck Out the Unwanted Files.

KISS — Keep it super simple, right? 😉

Save the Delta Table as CSV

df = spark.table(“schema_name.table_name”)

(df.coalesce(1)
.write
.mode("overwrite")
.format("csv")
.option("header", "true")
.option("sep", ",")
.save("/mnt/temp_path/filename.csv"))

This was supposed to be a smooth save, but guess what? It ended up crafting a folder named “filename.csv.” Once you peek inside, you’ll find a part file, a success file, and a bunch of other log files.

So, change this to:

.save("/mnt/temp_path")

Our next move involves hunting down that data-carrying file, kicking off with the title “part-”. Don’t sweat it — we’ve already cooked up a solo partition file using a coalesce(1) trick.

Detect the Part Files

list_of_files = dbutils.fs.ls("/mnt/temp_path")
for item in list_of_files:
if item.name.startswith("part"):
csv_part_file = item.path

Now that we’ve spotted the part file, it’s time to jazz it up by moving and giving it a snazzy CSV name, while clearing out the rest.

Rename and move them and chuck Out the Unwanted Files

dbutils.fs.mv(csv_part_file, "/mnt/final_path", True)
dbutils.fs.rm("/mnt/temp_path", True)

And there you have it! With these steps, you’ve successfully transformed a Delta table into a polished CSV, neatly organized within the confines of Azure Data Lake Storage Gen2.

Happy coding!

--

--