line-by-line file processing, for-loop vs with

Levon · Jun 21, 2012 · Viewed 31.1k times

I am trying to understand the trade-offs/differences between these two ways of opening files for line-by-line processing:

with open('data.txt') as inf:
    for line in inf:
        # etc

vs

for line in open('data.txt'):
    # etc

I understand that using with ensures the file is closed when the "with-block" (suite?) is exited (or an exception is encountered). So I have been using with ever since I learned about it here.

Re for-loop: From searching around the net and SO, it seems that whether the file is closed when the for-loop is exited is implementation dependent? And I couldn't find anything about how this construct would deal with exceptions. Does anyone know?

If I am mistaken about anything above, I'd appreciate corrections. Otherwise, is there ever a reason to use the for construct over the with? (Assuming you have a choice, i.e., aren't limited by Python version.)

Answer

John La Rooy · Jun 21, 2012

The problem with this

for line in open('data.txt'):
    # etc

is that you don't keep an explicit reference to the open file, so how do you close it? The lazy way is to wait for the garbage collector to clean it up, but that may mean the resources aren't freed in a timely manner.

So you can say

inf = open('data.txt')
for line in inf:
    # etc
inf.close()

Now what happens if there is an exception while you are inside the for loop? The call to inf.close() is never reached, so the file won't get closed explicitly.
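To make that concrete, here is a small sketch of the failure mode. It is only an illustration: the scratch file, the simulated ValueError, and the function name leaves_file_open_on_error are all made up for the demo, and the outer try/except exists only so the script can inspect the file object afterwards.

```python
import os
import tempfile


def leaves_file_open_on_error():
    """Return True if the file is still open after the loop raises."""
    # Hypothetical scratch file so the sketch does not depend on data.txt.
    fd, path = tempfile.mkstemp(text=True)
    with os.fdopen(fd, "w") as out:
        out.write("first\nsecond\n")

    inf = open(path)
    try:
        for line in inf:
            # Simulated processing failure on the first line.
            raise ValueError("simulated failure")
        inf.close()  # skipped, because the loop raised
    except ValueError:
        pass  # swallowed only so we can look at inf below

    still_open = not inf.closed
    inf.close()  # clean up by hand for the demo
    os.remove(path)
    return still_open


print(leaves_file_open_on_error())  # prints True
```

The file object is still open after the exception, which is exactly the leak the manual close() version is exposed to.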

Add a try/finally:

inf = open('data.txt')
try:
    for line in inf:
        # etc
finally:
    inf.close()

This is a lot of code to do something pretty simple, so Python added the with statement to let this code be written in a more readable way, which gets us here:

with open('data.txt') as inf:
    for line in inf:
        # etc

So, that is the preferred way to open the file. If your Python is too old for the with statement, you should use the try/finally version for production code.
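As a quick check of that claim, the sketch below raises inside the loop and then looks at the file object once the with block has been left. Same caveats as any toy example: the scratch file, the simulated ValueError, and the name closes_file_on_error are assumptions for the demo, not part of the original question.

```python
import os
import tempfile


def closes_file_on_error():
    """Return True if the file is closed after the with block raises."""
    # Hypothetical scratch file so the sketch does not depend on data.txt.
    fd, path = tempfile.mkstemp(text=True)
    with os.fdopen(fd, "w") as out:
        out.write("first\nsecond\n")

    handle = None
    try:
        with open(path) as inf:
            handle = inf  # keep a reference so we can inspect it afterwards
            for line in inf:
                # Simulated processing failure on the first line.
                raise ValueError("simulated failure")
    except ValueError:
        pass  # swallowed only so we can look at handle below

    was_closed = handle.closed
    os.remove(path)
    return was_closed


print(closes_file_on_error())  # prints True
```

Here the file is already closed by the time the exception reaches the except clause, with no explicit close() call anywhere.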