I am trying to understand the trade-offs/differences between these two ways of opening files for line-by-line processing:
    with open('data.txt') as inf:
        for line in inf:
            # etc
vs
    for line in open('data.txt'):
        # etc
I understand that using with ensures the file is closed when the "with-block" (suite?) is exited (or an exception is encountered). So I have been using with ever since I learned about it here.
Re the for-loop: From searching around the net and SO, it seems that whether the file is closed when the for-loop is exited is implementation dependent? And I couldn't find anything about how this construct would deal with exceptions. Does anyone know?
If I am mistaken about anything above, I'd appreciate corrections; otherwise, is there a reason to ever use the for construct over the with? (Assuming you have a choice, i.e., aren't limited by Python version.)
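The problem with this
    for line in open('data.txt'):
        # etc
is that you don't keep an explicit reference to the open file, so how do you close it? The lazy way is to wait for the garbage collector to clean it up, but that may mean that the resources aren't freed in a timely manner. CPython's reference counting usually closes the file as soon as the loop finishes, but other implementations such as PyPy or Jython may close it much later.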
So you can say
    inf = open('data.txt')
    for line in inf:
        # etc
    inf.close()
Now what happens if there is an exception while you are inside the for loop? The file won't get closed explicitly.
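To make that concrete, here is a minimal sketch of the failure. The scratch file, its contents, and the variable name closed_after_error are made up for illustration; the point is only that the close() call after the loop is skipped when the loop body raises.

```python
import os
import tempfile

# Hypothetical scratch file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("1\n2\noops\n")

inf = open(path)
try:
    for line in inf:
        int(line)      # raises ValueError on the "oops" line
    inf.close()        # skipped, because the loop raised first
except ValueError:
    pass

# The file object is still open at this point.
closed_after_error = inf.closed

# Clean up by hand, which is exactly the step that is easy to forget.
inf.close()
os.remove(path)
```

Here inf is still reachable, so it can be closed afterwards. In the one-line for form there is no name to call close() on at all.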
Add a try/finally
    inf = open('data.txt')
    try:
        for line in inf:
            # etc
    finally:
        inf.close()
This is a lot of code to do something pretty simple, so Python added the with statement to enable this code to be written in a more readable way. Which gets us to here:
    with open('data.txt') as inf:
        for line in inf:
            # etc
So, that is the preferred way to open the file. If your Python is too old for the with statement, you should use the try/finally version for production code.
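If you want to convince yourself that the with statement really does close the file when the loop body raises, a small check along these lines should do it. This is only a sketch: the scratch file, its contents, and the name closed_after_with are invented for the example.

```python
import os
import tempfile

# Hypothetical scratch file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("1\n2\noops\n")

try:
    with open(path) as inf:
        for line in inf:
            int(line)  # raises ValueError on the "oops" line
except ValueError:
    pass

# The with statement closed the file on the way out, despite the exception.
closed_after_with = inf.closed

os.remove(path)
```

The name inf is still bound after the block, but the file object it refers to has already been closed.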