How can I convert Parquet to CSV on a local file system (e.g. with Python or some library), but WITHOUT Spark? I am looking for the simplest, most minimal solution possible, because I need to automate everything and do not have many resources.
I tried e.g. parquet-tools on my Mac, but the data output did not look correct.
The output needs to be such that when data is missing in some columns, the CSV has a corresponding NULL (an empty field between two commas).
Thanks.
You can do this by using the Python packages pandas and pyarrow (pyarrow is an optional dependency of pandas that you need for this feature).
import pandas as pd
# Read the Parquet file into a DataFrame
df = pd.read_parquet('filename.parquet')
# Write the DataFrame out as a CSV file
df.to_csv('filename.csv')
When you need to modify the contents of the file, you can apply standard pandas operations to df before writing it out.