How to change the field separator of a file using Python?

Pierre · May 18, 2011 · Viewed 15.4k times

I'm new to Python, coming from the R world, and I'm working on big text files structured in data columns (this is LiDAR data, so generally 60 million+ records).

Is it possible to change the field separator (e.g. from tab-delimited to comma-delimited) of such a big file without having to read the file and do a for loop on the lines?

Answer

Eli Bendersky · May 18, 2011

No.

  • Read the file in
  • Change separators for each line
  • Write each line back

This is easily doable with just a few lines of Python (not tested but the general approach works):

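# Python - it's so readable, the code basically just writes itself ;-)
#
# Stream the input one line at a time, swapping tabs for commas.
with open('infile') as infile, open('outfile', 'w') as outfile:
    for line in infile:
        # The trailing newline stays attached to the last field, so it is written back as-is.
        fields = line.split('\t')
        outfile.write(','.join(fields))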

I'm not familiar with R, but if it has a library function for this it's probably doing exactly the same thing.

Note that this code only reads one line at a time from the file, so the file can be larger than the physical RAM: it's never loaded into memory in full.
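
One caveat with plain split/join: if a field itself contains a comma, the output line becomes ambiguous. A possible variant, sketched here with the standard library's csv module, quotes such fields and still streams row by row. The function name convert_delimiter and its parameters are made up for illustration and are not part of the original answer; the sketch also assumes the input has no quoting of its own.

```python
import csv


def convert_delimiter(src_path, dst_path, src_delim="\t", dst_delim=","):
    """Rewrite src_path to dst_path with a different field separator.

    Hypothetical helper: rows are streamed one at a time, so memory use
    stays small regardless of file size. Returns the number of rows written.
    """
    rows = 0
    # newline="" lets the csv module manage line endings itself.
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        # QUOTE_NONE: treat quote characters in the input as ordinary text
        # (assumption: the tab-delimited source is not quoted).
        reader = csv.reader(src, delimiter=src_delim, quoting=csv.QUOTE_NONE)
        # The writer's default QUOTE_MINIMAL quotes only fields that need it,
        # for example fields that contain the new delimiter.
        writer = csv.writer(dst, delimiter=dst_delim, lineterminator="\n")
        for row in reader:
            writer.writerow(row)
            rows += 1
    return rows
```

Calling convert_delimiter('infile', 'outfile') does the same job as the loop above for purely numeric columns; it will likely be somewhat slower than split/join, so for data that is known to contain no commas the simpler loop is a reasonable choice.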