How to avoid Python fileinput buffering

John Zwinck · May 17, 2011

Possible Duplicate:
Setting smaller buffer size for sys.stdin?

I have a Python (2.4/2.7) script using fileinput to read from standard input or from files. It's easy to use, and works well except for one case:

tail -f log | filter.py

The problem is that my script buffers its input, whereas (at least in this case) I want to see its output right away. This seems to stem from the fact that fileinput uses readlines() to grab up to its bufsize worth of bytes before it does anything. I tried using a bufsize of 1 and it didn't seem to help (which was somewhat surprising).
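For reference, a filter of the kind described could look roughly like the sketch below. The function name echo_with_fileinput and its parameters are hypothetical, chosen only for illustration; the real filter.py is not shown here.

```python
import fileinput
import sys


def echo_with_fileinput(files=None, out=sys.stdout):
    """Copy every input line to `out` using fileinput.

    With files=None, fileinput falls back to the paths in sys.argv[1:],
    or to standard input when no paths are given.
    """
    for line in fileinput.input(files=files):
        out.write(line)
```

Calling echo_with_fileinput() with no arguments at the bottom of the script gives the usual "files or stdin" behaviour.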

I did find that I can write code like this which does not buffer:

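import sys

# readline() hands back each line as it arrives instead of reading ahead
while True:
    line = sys.stdin.readline()
    if not line:  # an empty string means end of input
        break
    sys.stdout.write(line)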

The problem with doing it this way is that I lose the fileinput functionality (namely that it automatically opens all the files passed to my program, or stdin if none, and it can even decompress input files automatically).

So how can I have the best of both? Ideally something where I don't need to explicitly manage my input file list (including decompression), and yet which doesn't delay input when used in a "streaming" way.

Answer

9000 · May 17, 2011

Try running the script with python -u; the python man page says that it will "force stdin, stdout and stderr to be totally unbuffered".

You can just add -u to the interpreter path in the hashbang line at the top of filter.py.
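Note that the Python 2 man page also says -u does not influence the internal buffering in readlines() and file-object iterators, and suggests sys.stdin.readline() in a loop as the workaround. If -u alone does not help, one possible approach is a small generator that keeps the "files or stdin" convenience but reads with readline(). This is only a sketch: the name iter_lines is hypothetical, "-" is assumed to mean standard input (as fileinput treats it), and decompression is not handled.

```python
import sys


def iter_lines(paths=None):
    """Yield lines one at a time from each path, or from stdin if none given."""
    if paths is None:
        paths = sys.argv[1:]  # same default source of file names as fileinput
    if not paths:
        paths = ['-']  # assumption: '-' stands for standard input
    for path in paths:
        if path == '-':
            stream = sys.stdin
        else:
            stream = open(path)
        # iter(readline, '') keeps calling readline() until it returns ''
        # (end of file), so each line is yielded as soon as it is read.
        for line in iter(stream.readline, ''):
            yield line
        if stream is not sys.stdin:
            stream.close()
```

In filter.py this could be used as "for line in iter_lines(): sys.stdout.write(line)", possibly followed by sys.stdout.flush() if the output is itself going into a pipe.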