CSV in Python adding an extra carriage return, on Windows

apalopohapa · Jul 7, 2010
import csv
outfile = open('test.csv', 'w')
writer = csv.writer(outfile, delimiter=',', quoting=csv.QUOTE_MINIMAL)
writer.writerow(['hi','dude'])
writer.writerow(['hi2','dude2'])
outfile.close()

It generates a file, test.csv, with an extra \r at the end of each row, like so:

test.csv

hi,dude\r\r\nhi2,dude2\r\r\n

instead of the expected:

hi,dude\r\nhi2,dude2\r\n

Why is this happening, or is this actually the desired behavior?

Note:

  • This behavior can occur with Python 2 or 3.

Answer

John Machin · Jul 7, 2010

Python 3:

The official csv documentation recommends opening the file with newline='' on all platforms to disable universal newlines translation:

import csv

with open('output.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    ...

The CSV writer terminates each line with the lineterminator of the dialect, which is \r\n for the default excel dialect on all platforms.
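To see the effect without needing a Windows machine, here is a minimal sketch that writes the question's two rows through an in-memory text layer. It assumes that io.TextIOWrapper with newline='\r\n' is a fair stand-in for Windows text mode (both turn every '\n' written into '\r\n'); the helper name csv_bytes is made up for this illustration.

```python
import csv
import io


def csv_bytes(newline):
    """Write two rows through a text layer and return the raw bytes produced."""
    raw = io.BytesIO()
    # newline='\r\n' imitates Windows text mode: each '\n' written becomes '\r\n'.
    # newline='' switches translation off, as the csv documentation recommends.
    text = io.TextIOWrapper(raw, encoding='utf-8', newline=newline)
    writer = csv.writer(text)  # default excel dialect, lineterminator '\r\n'
    writer.writerow(['hi', 'dude'])
    writer.writerow(['hi2', 'dude2'])
    text.flush()
    return raw.getvalue()


translated = csv_bytes('\r\n')   # simulated Windows text mode
untranslated = csv_bytes('')     # translation disabled

print(repr(translated))    # b'hi,dude\r\r\nhi2,dude2\r\r\n'
print(repr(untranslated))  # b'hi,dude\r\nhi2,dude2\r\n'
```

The first result shows the doubled \r from the question; the second is the expected output once translation is disabled.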


Python 2:

On Windows, always open your files in binary mode ("rb" or "wb") before passing them to csv.reader or csv.writer.

Although the file is a text file, CSV is regarded as a binary format by the libraries involved, with \r\n separating records. If that separator is written in text mode, the Python runtime replaces the \n with \r\n, hence the \r\r\n observed in the file.
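If the same script has to run under both versions, the two recommendations can be combined. This is only a sketch: open_csv_for_writing is a hypothetical helper, not part of the csv module, and the temporary path is just for demonstration.

```python
import csv
import os
import sys
import tempfile


def open_csv_for_writing(path):
    """Hypothetical helper: open `path` so csv.writer output is not newline-translated."""
    if sys.version_info[0] >= 3:
        # Python 3: text mode, with newline translation switched off.
        return open(path, 'w', newline='')
    # Python 2: binary mode, so the runtime leaves '\r\n' alone.
    return open(path, 'wb')


path = os.path.join(tempfile.mkdtemp(), 'test.csv')
outfile = open_csv_for_writing(path)
try:
    writer = csv.writer(outfile, delimiter=',', quoting=csv.QUOTE_MINIMAL)
    writer.writerow(['hi', 'dude'])
    writer.writerow(['hi2', 'dude2'])
finally:
    outfile.close()

# Read the file back as bytes to check exactly what landed on disk.
with open(path, 'rb') as f:
    written = f.read()
print(repr(written))
```

Reading the file back in binary mode is also a handy way to check an existing file for stray \r\r\n sequences.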

See this previous answer.