How can I remove carriage return from a text file with Python?

mrcoulson · Jul 15, 2013

The things I've googled haven't worked, so I'm turning to experts!

I have some text in a tab-delimited text file that has some sort of carriage return in it (when I open it in Notepad++ and use "show all characters", I see [CR][LF] at the end of the line). I need to remove this carriage return (or whatever it is), but I can't seem to figure it out. Here's a snippet of the text file showing a line with the carriage return:

firstcolumn secondcolumn    third   fourth  fifth   sixth       seventh
moreoftheseventh        8th             9th 10th    11th    12th                    13th

Here's the code I'm trying to use to replace it, but it's not finding the return:

with open(infile, "r") as f:
    for line in f:
        if "\n" in line:
            line = line.replace("\n", " ")

My script just doesn't find the carriage return. Am I doing something wrong or making an incorrect assumption about this carriage return? I could just remove it manually in a text editor, but there are about 5000 records in the text file that may also contain this issue.

Further information: The goal here is to select two columns from the text file, so I split each line on \t characters and refer to the values by their index in the resulting list. It works on any line without the returns, but fails on the lines with the returns because, for example, there is no element 9 in those lines.

vals = line.split("\t")
print(vals[0] + " " + vals[9])

So, for the line of text above, this code fails because there is no index 9 in that particular array. For lines of text that don't have the [CR][LF], it works as expected.
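For illustration, here is a minimal sketch of one way the broken lines could be stitched back together before splitting. It assumes every complete record has a known number of tab-separated fields and that a stray line break falls inside a field rather than after the last one. The function name rejoin_records, the expected_fields parameter and the sample data are all made up for the example and are not from the original file.

```python
def rejoin_records(lines, expected_fields):
    """Merge physical lines until each record has expected_fields tab-separated fields."""
    buffer = ""
    for line in lines:
        piece = line.rstrip("\r\n")
        # Join a continuation line to the pending record with a space,
        # matching the replacement attempted in the question.
        buffer = piece if not buffer else buffer + " " + piece
        if buffer.count("\t") + 1 >= expected_fields:
            yield buffer
            buffer = ""
    if buffer:
        # Leftover partial record at end of input; emitted unchanged.
        yield buffer


# Hypothetical sample: the first record is split across two physical lines.
sample = [
    "a\tb\tc\tfirst half\r\n",
    "second half\td\te\r\n",
    "f\tg\th\ti\tj\tk\r\n",
]
records = list(rejoin_records(sample, expected_fields=6))
for record in records:
    vals = record.split("\t")
    print(vals[0] + " " + vals[5])
```

A break that lands inside the last field of a record would not be caught by this field count, so the approach only helps when the assumption above holds.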

Answer

mrcoulson · Jul 16, 2013

Technically, there is an answer!

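with open(filetoread, "rb") as inf:
    # Binary mode on the output too, so the bytes read above can be written back (required on Python 3).
    with open(filetowrite, "wb") as fixed:
        for line in inf:
            # Assumption: the unwanted characters are "\r" bytes; drop them and keep the "\n".
            fixed.write(line.replace(b"\r", b""))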

The b in open(filetoread, "rb") opens the file in binary mode, so Python does not translate the line endings and the [CR][LF] bytes can be accessed and removed. This answer actually came from Stack Overflow user Kenneth Reitz, off the site.
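To see why the mode matters, here is a small sketch (written for Python 3, with made-up sample content and a throwaway temp file) that reads the same bytes three ways. In default text mode the "\r\n" pairs are translated to "\n", so the "\r" never shows up. Binary mode, or text mode with newline="", returns the line endings untouched.

```python
import os
import tempfile

# Throwaway file with Windows-style line endings (sample content is made up).
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample.txt")
with open(sample_path, "wb") as f:
    f.write(b"one\ttwo\r\nthree\tfour\r\n")

# Default text mode: "\r\n" is translated to "\n", hiding the "\r".
with open(sample_path, "r") as f:
    text_lines = f.readlines()

# Binary mode: the bytes come back exactly as stored, "\r" included.
with open(sample_path, "rb") as f:
    raw_lines = f.readlines()

# Text mode with newline="": strings instead of bytes, but no translation.
with open(sample_path, "r", newline="") as f:
    untranslated_lines = f.readlines()

print(text_lines)          # ['one\ttwo\n', 'three\tfour\n']
print(raw_lines)           # [b'one\ttwo\r\n', b'three\tfour\r\n']
print(untranslated_lines)  # ['one\ttwo\r\n', 'three\tfour\r\n']
```

So a search for "\r" only has a chance of matching in the second and third cases.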

Thanks everyone!