Batch conversion of .dbf to .csv in Python

5tanczak · Aug 26, 2013 · Viewed 9.6k times

I have ~300 folders with .dbf files that I would like to convert to .csv files.

I am using os.walk to find all the .dbf files and then looping over them with the dbfpy module to convert each one to a .csv. It seems to be finding and reading the .dbf files correctly, but no .csv output appears. I believe the csv.writer code is the issue: I am not receiving any errors, but the files stay as .dbf.

My code below is based on code found here.

import csv
from dbfpy import dbf
import os

path = r"\Documents\House\DBF"


for dirpath, dirnames, filenames in os.walk(path):
    for filename in filenames:
        if filename.endswith('.DBF'):
            in_db = dbf.Dbf(os.path.join(dirpath, filename))
            csv_fn = filename[:-4]+ ".csv"
            out_csv = csv.writer(open(csv_fn,'wb'))

        names = []
        for field in in_db.header.fields:
            names.append(field.name)
        out_csv.writerow(names)


        for rec in in_db:
            out_csv.writerow(rec.fieldData)

        in_db.close()

Answer

munk · Aug 27, 2013

The original file will stay as a .dbf. You're not actually replacing it; you're creating a new .csv file. I think the problem is that the write to disk never happens: the file passed to csv.writer is never closed, so its buffer is never flushed.

Another problem I see is that out_csv is created only inside the if block, while the code that writes the header and records runs for every file, so any file with a different extension in the same directory will cause trouble (rows written to the wrong csv, or a NameError if it comes first).
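You can see the buffering issue on its own, independent of dbfpy (a minimal sketch; out.csv is just a throwaway name):

    import csv

    f = open('out.csv', 'wb')
    csv.writer(f).writerow(['a', 'b'])
    # Until f.close() runs (or the buffer fills up), 'out.csv' may still be
    # empty on disk even though the row was "written".
    f.close()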

Try using a context manager:

for dirpath, dirnames, filenames in os.walk(path):
    for filename in filenames:
        if filename.endswith('.DBF'):
            # Write the .csv next to the .dbf it came from, so output from
            # different folders doesn't collide in the working directory.
            csv_fn = os.path.join(dirpath, filename[:-4] + ".csv")
            with open(csv_fn, 'wb') as csvfile:
                in_db = dbf.Dbf(os.path.join(dirpath, filename))
                out_csv = csv.writer(csvfile)

                # header row: the dbf field names
                names = []
                for field in in_db.header.fields:
                    names.append(field.name)
                out_csv.writerow(names)

                # one csv row per dbf record
                for rec in in_db:
                    out_csv.writerow(rec.fieldData)

                in_db.close()

The 'with' statement (the context manager) will close the file and flush the buffer at the end, without you needing to do that explicitly.
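For what it's worth, the with block above is roughly equivalent to this explicit try/finally (a minimal sketch, not part of the original answer):

    csvfile = open(csv_fn, 'wb')
    try:
        out_csv = csv.writer(csvfile)
        # ... write the header row and the records here ...
    finally:
        # close() flushes any buffered rows to disk, even if an error occurred
        csvfile.close()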