Why can't I call read() twice on an open file?

helpermethod picture helpermethod · Oct 11, 2010 · Viewed 71.5k times · Source

For an exercise I'm doing, I'm trying to read the contents of a given file twice using the read() method. Strangely, when I call it the second time, it doesn't seem to return the file content as a string?

Here's the code

f = f.open()

# get the year
match = re.search(r'Popularity in (\d+)', f.read())

if match:
  print match.group(1)

# get all the names
matches = re.findall(r'<td>(\d+)</td><td>(\w+)</td><td>(\w+)</td>', f.read())

if matches:
  # matches is always None

Of course I know that this is not the most efficient or best way, this is not the point here. The point is, why can't I call read() twice? Do I have to reset the file handle? Or close / reopen the file in order to do that?

Answer

Tim picture Tim · Oct 11, 2010

Calling read() reads through the entire file and leaves the read cursor at the end of the file (with nothing more to read). If you are looking to read a certain number of lines at a time you could use readline(), readlines() or iterate through lines with for line in handle:.

To answer your question directly, once a file has been read, with read() you can use seek(0) to return the read cursor to the start of the file (docs are here). If you know the file isn't going to be too large, you can also save the read() output to a variable, using it in your findall expressions.

Ps. Dont forget to close the file after you are done with it ;)