Delete blank rows from CSV?

debugged · Dec 23, 2010 · Viewed 76.6k times

I have a large csv file in which some rows are entirely blank. How do I use Python to delete all blank rows from the csv?

After all your suggestions, this is what I have so far:

import csv

# open input csv for reading
inputCSV = open(r'C:\input.csv', 'rb')

# create output csv for writing
outputCSV = open(r'C:\OUTPUT.csv', 'wb')

# prepare output csv for appending
appendCSV = open(r'C:\OUTPUT.csv', 'ab')

# create reader object
cr = csv.reader(inputCSV, dialect = 'excel')

# create writer object
cw = csv.writer(outputCSV, dialect = 'excel')

# create writer object for append
ca = csv.writer(appendCSV, dialect = 'excel')

# add pre-defined fields
cw.writerow(['FIELD1_','FIELD2_','FIELD3_','FIELD4_'])

# delete existing field names in input CSV
# ???????????????????????????

# loop through input csv, check for blanks, and write all changes to append csv
for row in cr:
    if row or any(row) or any(field.strip() for field in row):
        ca.writerow(row)

# close files
inputCSV.close()
outputCSV.close()
appendCSV.close()

Is this ok or is there a better way to do this?

Answer

Laurence Gonsalves · Dec 23, 2010

Use the csv module:

import csv
...

with open(in_fnam, newline='') as in_file:
    with open(out_fnam, 'w', newline='') as out_file:
        writer = csv.writer(out_file)
        for row in csv.reader(in_file):
            # a completely blank line is parsed as an empty list
            if row:
                writer.writerow(row)

If you also need to remove rows where all of the fields are empty, change the if row: line to:

if any(row):

And if you also want to treat fields that consist of only whitespace as empty you can replace it with:

if any(field.strip() for field in row):
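
To see how the three conditions differ, here is a small sketch that runs each of them over the same in-memory data. The sample text and the helper name keep_rows are made up for illustration; they are not part of the csv module.

```python
import csv
import io

# Hypothetical sample: a header, one blank line, one row of empty fields,
# one row of whitespace-only fields, and one ordinary row.
SAMPLE = "a,b,c\r\n\r\n,,\r\n , ,\t\r\n1,2,3\r\n"

def keep_rows(text, predicate):
    """Return the parsed rows of `text` for which `predicate(row)` is true."""
    reader = csv.reader(io.StringIO(text, newline=''))
    return [row for row in reader if predicate(row)]

# `if row:` drops only the completely blank line (parsed as []).
non_blank_lines = keep_rows(SAMPLE, lambda row: bool(row))

# `if any(row):` also drops the row whose fields are all empty strings.
non_empty_fields = keep_rows(SAMPLE, any)

# `if any(field.strip() for field in row):` also drops whitespace-only rows.
non_whitespace = keep_rows(SAMPLE, lambda row: any(f.strip() for f in row))
```

Each condition is stricter than the one before it, so the last one removes everything the first two remove.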

Note that in Python 2.x, the csv module expects binary files, so you need to open your files with the 'b' flag. In 3.x, doing this will result in an error; open the files in text mode with newline='' instead.
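
For Python 3, the pieces above can be combined into one function that also covers the header replacement asked about in the question. This is only a sketch: the name remove_blank_rows and its header parameter are hypothetical, and it assumes the first non-blank row of the input is the old header. Both files are opened in text mode with newline='', which is what the csv module documentation recommends so that line endings are not translated twice.

```python
import csv

def remove_blank_rows(in_path, out_path, header=None):
    """Copy a CSV, dropping rows whose fields are all empty or whitespace.

    If `header` is given, the first kept row of the input is assumed to be
    the old header and is replaced by `header`.
    Returns the number of data rows written (the header is not counted).
    """
    written = 0
    with open(in_path, newline='') as in_file, \
         open(out_path, 'w', newline='') as out_file:
        writer = csv.writer(out_file)
        # Lazily filter out blank, empty-field, and whitespace-only rows.
        rows = (row for row in csv.reader(in_file)
                if any(field.strip() for field in row))
        if header is not None:
            next(rows, None)  # skip the old header, if there is one
            writer.writerow(header)
        for row in rows:
            writer.writerow(row)
            written += 1
    return written
```

With the question's field names, a call might look like remove_blank_rows(r'C:\input.csv', r'C:\OUTPUT.csv', header=['FIELD1_', 'FIELD2_', 'FIELD3_', 'FIELD4_']). Because the header and the data go through the same writer, there is no need to open the output file a second time in append mode.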