Pandas to_csv() checking for overwrite

Robin Kramer · Nov 2, 2016

When I am analyzing data, I save my DataFrames to CSV files using DataFrame.to_csv(). However, the function (over)writes the file without checking whether one with the same name already exists. Is there a way to check whether the file already exists and, if so, ask for a new filename?

I know I can add the system's datetime to the filename, which would prevent any overwriting, but I would like to know when I have made the mistake.

Answer

tda · Nov 2, 2016

Try the following:

import glob
import pandas as pd

# Placeholder DataFrame for illustration; replace with the data you want to save
df = pd.DataFrame({'a': [1, 2, 3]})

# Give the filename you wish to save the file to
filename = 'Your_filename.csv'

# Search for any existing files which match your filename
# (glob.escape keeps characters such as [ ] ? * in the name literal)
files_present = glob.glob(glob.escape(filename))

# If there are no matching files, write the CSV; otherwise print a warning
if not files_present:
    df.to_csv(filename)
else:
    print('WARNING: This file already exists!')

I have not tested this, but it is adapted from code I wrote previously. It simply stops an existing file from being overwritten. N.B. you will have to change the filename variable yourself before saving, or add a datetime to the name as you suggested. I hope this helps in some way.
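The question also asks how to be prompted for a new filename. Here is a minimal sketch of one way to do that, not a definitive implementation. It makes two assumptions: the helper names `write_csv_no_overwrite` and `save_with_prompt` and the `ask` parameter are hypothetical, and `df` is any pandas DataFrame. Instead of checking first and writing afterwards, it opens the file in Python's exclusive-creation mode (`'x'`), which raises `FileExistsError` if the file is already there, and passes the open handle to `to_csv()`.

```python
def write_csv_no_overwrite(df, filename):
    """Write df to filename; raise FileExistsError if the file already exists.

    Hypothetical helper: df is assumed to be a pandas DataFrame.
    """
    # Mode 'x' creates the file exclusively, so the existence check and the
    # file creation happen in a single step. newline='' leaves line endings
    # to the CSV writer.
    with open(filename, "x", newline="") as handle:
        df.to_csv(handle)


def save_with_prompt(df, filename, ask=input):
    """Keep asking for a new name until the write succeeds; return the name used.

    Hypothetical helper: ask is any callable that takes a prompt string and
    returns a filename (input() by default).
    """
    while True:
        try:
            write_csv_no_overwrite(df, filename)
            return filename
        except FileExistsError:
            filename = ask(f"'{filename}' already exists. New filename: ")
```

Calling `save_with_prompt(df, 'Your_filename.csv')` would write the file if the name is free and otherwise prompt on the console until an unused name is entered. Because the check and the creation are a single operation, another process cannot create the file in between, which a separate check followed by a write cannot guarantee.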