Is it safe to open a file several times at once in Python?

Ryan Haining · Jan 24, 2013

I seem to recall cases in lower-level languages where opening a file more than once in a program could result in a shared seek pointer. From messing around in Python a bit, this doesn't seem to be happening for me:

$ cat file.txt
first line!
second
third
fourth
and fifth
>>> f1 = open('file.txt')
>>> f2 = open('file.txt')
>>> f1.readline()
'first line!\n'
>>> f2.read()
'first line!\nsecond\nthird\nfourth\nand fifth\n'
>>> f1.readline()
'second\n'
>>> f2.read()
''
>>> f2.seek(0)
>>> f1.readline()
'third\n'

Is this behavior known to be safe? I'm having a hard time finding a source saying that it's okay, and it would help a lot if I could depend on this.

I'm not seeing the position as an attribute of the file object, otherwise I'd have more confidence in this. I know it could be kept internally in the iterator, but I don't know how .tell() would get to it in that case.

>>> dir(f1)
['__class__', '__delattr__', '__doc__', '__getattribute__', '__hash__',
 '__init__', '__iter__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
 '__setattr__', '__str__', 'close', 'closed', 'encoding', 'fileno', 'flush',
 'isatty', 'mode', 'name', 'newlines', 'next', 'read', 'readinto', 'readline',
 'readlines', 'seek', 'softspace', 'tell', 'truncate', 'write', 'writelines',
 'xreadlines']

UPDATE
On page 161 of The Python Essential Reference it states:

The same file can be opened more than once in the same program (or in different programs). Each instance of the open file has its own file pointer that can be manipulated independently.

So it does in fact seem to be safe, defined behavior.

Answer

kindall · Jan 24, 2013

On a modern OS (post-1969 for UNIX-like OSs, or post-2000 for Windows, and probably before that but I'm counting Win2K as the first "modern" Windows), each instance of an open file (that is, each separate call to open, which yields its own file descriptor) has its own seek pointer. There is no magic in Python's file class that would cause instances to share state; file is a wrapper for an ordinary C file handle, which itself encapsulates an OS file descriptor, and the implementations of file.tell() and file.seek() call the corresponding C stdio functions. (For the messy details see CPython's fileobject.c.) There can be differences between the C library's behavior and the underlying OS's behavior, but in this particular case that's not a factor.
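A quick way to check the independent seek pointers for yourself is sketched below. It is only an illustration, written for Python 3 rather than the Python 2 session in the question; the scratch file comes from tempfile and is not part of the original post, and binary mode is used so that tell() reports plain byte offsets.

```python
import os
import tempfile

# Create a small scratch file (a throwaway temp file, purely for illustration).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"first line!\nsecond\nthird\n")

# Open the same file twice; binary mode keeps tell() values as byte offsets.
with open(path, "rb") as f1, open(path, "rb") as f2:
    first = f1.readline()        # advances only f1
    everything = f2.read()       # advances only f2, to end of file
    pos1, pos2 = f1.tell(), f2.tell()
    second = f1.readline()       # f1 carries on from its own position

os.remove(path)
print(first, pos1, pos2, second)
```

Reading all of the file through f2 leaves f1's position untouched, so the second readline() on f1 still returns the second line.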

If you're using IronPython or Jython, it's going to use the standard .NET or Java file object for its underlying implementation, which in turn is going to use the standard C library or OS implementation.

So your approach is fine unless you are somehow running Python on some non-standard OS with bizarre I/O behavior.

You may get unexpected results when writing if you don't flush in a timely manner; data can hang out in memory for some time before it actually hits the disk and is available to the other file descriptors you've opened on the same file. As abarnert points out in a comment, that's problematic anyway, except in very simple cases.
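The buffering caveat can be seen in a small sketch like the following. It assumes Python 3 and CPython's default buffer size, under which a short write stays in the writer's in-memory buffer until flush() is called; the scratch file and the variable names are made up for the example.

```python
import os
import tempfile

# Scratch file for the demonstration (hypothetical, not from the original post).
fd, path = tempfile.mkstemp()
os.close(fd)

with open(path, "wb") as writer, open(path, "rb") as reader:
    writer.write(b"hello\n")       # small write: held in writer's buffer
    before_flush = reader.read()   # nothing has reached the file yet
    writer.flush()                 # push the buffered bytes to the OS
    after_flush = reader.read()    # the data is now visible to the reader

os.remove(path)
print(repr(before_flush), repr(after_flush))
```

With a write larger than the buffer, some of the data would reach the file before the explicit flush, which is part of why mixing readers and writers on one file gets unpredictable outside very simple cases.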