Reading from a frequently updated file

JimS picture JimS · Mar 24, 2011 · Viewed 57.2k times · Source

I'm currently writing a program in python on a Linux system. The objective is to read a log file and execute a bash command upon finding a particular string. The log file is being constantly written to by another program.

My question: If I open the file using the open() method will my Python file object be updated as the actual file gets written to by the other program or will I have to reopen the file at timed intervals?

UPDATE: Thanks for answers so far. I perhaps should have mentioned that the file is being written to by a Java EE app so I have no control over when data gets written to it. I've currently got a program that reopens the file every 10 seconds and tries to read from the byte position in the file that it last read up to. For the moment it just prints out the string that's returned. I was hoping that the file did not need to be reopened but the read command would somehow have access to the data written to the file by the Java app.

#!/usr/bin/python
import time

fileBytePos = 0
while True:
    inFile = open('./server.log','r')
    inFile.seek(fileBytePos)
    data = inFile.read()
    print data
    fileBytePos = inFile.tell()
    print fileBytePos
    inFile.close()
    time.sleep(10)

Thanks for the tips on pyinotify and generators. I'm going to have a look at these for a nicer solution.

Answer

Jeff Bauer picture Jeff Bauer · Mar 24, 2011

I would recommend looking at David Beazley's Generator Tricks for Python, especially Part 5: Processing Infinite Data. It will handle the Python equivalent of a tail -f logfile command in real-time.

# follow.py
#
# Follow a file like tail -f.

import time
def follow(thefile):
    thefile.seek(0,2)
    while True:
        line = thefile.readline()
        if not line:
            time.sleep(0.1)
            continue
        yield line

if __name__ == '__main__':
    logfile = open("run/foo/access-log","r")
    loglines = follow(logfile)
    for line in loglines:
        print line,