Stripping Commas and Periods

Keyfer Mathewson · Mar 20, 2013 · Viewed 27.5k times

I am trying to read a text file, split it into words, and collect the words in a list.

The problem I'm having is removing the commas and periods from those words.

My code is below:

#Process a '*.txt' file.
def Process():
    name = input("What is the name of the file you would like to read from? ")

    file = open( name , "r" )
    text = [word for line in file for word in line.lower().split()]
    word = word.replace(",", "")
    word = word.replace(".", "")

    print(text)

The output I'm currently getting is this:

['this', 'is', 'the', 'first', 'line', 'of', 'the', 'file.', 'this', 'is', 'the', 'second', 'line.']

As you can see, the words "file" and "line" have a period at the end of them.

The text file I'm reading is:

This is the first line of the file.

This is the second line.

Thanks in advance.

Answer

jamylak · Mar 20, 2013

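These lines have no effect on text, because they run after the list has already been built and str.replace returns a new string instead of changing the list: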

word = word.replace(",", "")
word = word.replace(".", "")
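A minimal sketch of why, using a made-up two-element list in place of the file contents (the list values here are an assumption for illustration). Rebinding a name to the result of replace does not touch the list the string came from:

```python
# Hypothetical stand-in for the list built from the file.
text = ["file.", "line."]

word = text[-1]               # a name bound to the last element
word = word.replace(".", "")  # rebinds `word` to a brand-new string

print(word)  # line
print(text)  # ['file.', 'line.'] (the list itself is unchanged)
```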

Just change your list comprehension to this:

text = [word.replace(",", "").replace(".", "")
        for line in file for word in line.lower().split()]
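For reference, here is the same fix as a self-contained function. This is only a sketch: read_words and its path parameter are hypothetical names that do not appear in the question, and the with block is an addition so that the file gets closed.

```python
def read_words(path):
    """Return the lowercased words in the file at `path`, minus commas and periods."""
    with open(path, "r") as file:
        # Clean each word as the list is built, so the result holds the cleaned strings.
        return [word.replace(",", "").replace(".", "")
                for line in file
                for word in line.lower().split()]
```

If other punctuation also needs to go, word.strip(string.punctuation) (after import string) is one alternative, though it only trims characters from the ends of each word.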