dup, dup2, tmpfile and stdout in python

Woltan · Jan 11, 2012

This is a follow-up question from here.


Where I want to go

I would like to temporarily redirect stdout into a temp file while Python itself is still able to print to the real stdout. This would involve the following steps:

  1. Create a copy of stdout (new)
  2. Create a temp file (tmp)
  3. Redirect stdout into tmp
  4. Tell python to use new as stdout
  5. Redirect tmp into the "real" stdout
  6. Tell python to use the "real" stdout again
  7. Read and close tmp

Implementation

I tried to implement the above in the following way:

import os
import subprocess
import sys

#A function that calls an external process to print to stdout as well as
#a python print to pythons stdout.
def Func(s, p = False):
    subprocess.call('echo "{0}"'.format(s), shell = True)
    if p:
        print "print"

sil = list() # <-- Some list to store the content of the temp files

print "0.1" # Some testing of the
Func("0.2") # functionality

new = os.dup(1)    # Create a copy of stdout (new)
tmp = os.tmpfile() # Create a temp file (tmp)

os.dup2(tmp.fileno(), 1)            # Redirect stdout into tmp
sys.stdout = os.fdopen(new, 'w', 0) # Tell python to use new as stdout

Func("0.3", True) # <--- This should print "0.3" to the temp file and "print" to stdout

os.dup2(new, 1)                   # Redirect tmp into "real" stdout
sys.stdout = os.fdopen(1, 'w', 0) # Tell python to use "real" stdout again

# Read and close tmp
tmp.flush()
tmp.seek(0, os.SEEK_SET)
sil.append(tmp.read())
tmp.close()

I would like to take a little break here to summarize.
The output to console up until here should read:

0.1
0.2
print

while sil should look like this: ['0.3\n']. So everything is working like a charm up until here. However, if I run the same steps a second time, like so:

print "1.1" # Some testing of the
Func("1.2") # functionality

new = os.dup(1)    # Create a copy of stdout (new)
tmp = os.tmpfile() # Create a temp file (tmp)

os.dup2(tmp.fileno(), 1)            # Redirect stdout into tmp
sys.stdout = os.fdopen(new, 'w', 0) # Tell python to use new as stdout

# This should print "1.3" to the temp file and "print" to stdout and is the crucial point!
Func("1.3", True) 

os.dup2(new, 1)                   # Redirect tmp into "real" stdout
sys.stdout = os.fdopen(1, 'w', 0) # Tell python to use "real" stdout again

# Read and close tmp
tmp.flush()
tmp.seek(0, os.SEEK_SET)
sil.append(tmp.read())

an error occurs and the output looks like this:

1.1
1.2
/bin/sh: line 0: echo: write error: Bad file descriptor
print

while sil reads: ['0.3\n', ''].

In other words: the second Func("1.3", True) is not able to write to the temp file.

Questions

  1. First of all, I would like to know why my script is not working like I want it to work. Meaning, why is it only possible in the first half of the script to write to the temp file?
  2. I am still a little puzzled by the usage of dup and dup2. While I think I understand how the redirection of stdout into a temp file works, I do not know why os.dup2(new, 1) is doing what it is doing. Maybe the answer could elaborate on what all the dup and dup2 calls in my script are doing.
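
For reference on the second question, here is a minimal sketch of what the two calls do. It is not part of the script above: a pipe stands in for the temp file so the effect can be read back directly, and the names saved and received are only illustrative. os.dup(fd) returns a new descriptor referring to the same open file as fd; os.dup2(src, dst) makes dst refer to whatever src refers to, silently closing dst's previous target first.

```python
import os

r, w = os.pipe()                 # a pipe stands in for the temp file

saved = os.dup(1)                # saved: a second descriptor for the real stdout
os.dup2(w, 1)                    # fd 1 now refers to the pipe's write end
os.write(1, b"to the pipe\n")    # anything written to fd 1 lands in the pipe
os.dup2(saved, 1)                # fd 1 refers to the real stdout again
os.close(saved)                  # the extra copy is no longer needed
os.close(w)                      # no write ends remain open, so reads can reach EOF

received = os.read(r, 1024)      # what went through fd 1 while it was redirected
os.close(r)
```

Read this way, os.dup2(new, 1) in the script does not "redirect tmp": it points fd 1 back at the file that new refers to, which is the original stdout.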

Answer

Anders Waldenborg · Jan 11, 2012

The reason you get a "bad file descriptor" is that the garbage collector closes the stdout FD for you. Consider these two lines:

sys.stdout = os.fdopen(1, 'w', 0)    # from first part of your script
...
sys.stdout = os.fdopen(new, 'w', 0)  # from second part of your script

Now, when the second of those two lines is executed, the first file object's reference count drops to zero and the garbage collector destroys it. File objects close their associated fd when they are destroyed, and that fd happens to be 1, i.e. stdout. So you need to be very careful with how you destroy objects created with os.fdopen.

Here is a small example to show the problem. os.fstat is just used as an example function that triggers the "Bad file descriptor" error when you pass it a closed fd.

import os
whatever = os.fdopen(1, 'w', 0)
os.fstat(1)
del whatever
os.fstat(1)
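
The same effect can be reproduced without sacrificing the real fd 1. The sketch below is an illustration rather than part of the original example: it assumes CPython 3 (Python 2's os.fdopen has no closefd argument), works on a duplicated descriptor, and uses a hypothetical helper fd_is_open.

```python
import os

def fd_is_open(fd):
    """Hypothetical helper: True if fd is currently a valid descriptor."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False

fd = os.dup(1)                            # scratch descriptor; the real fd 1 is never at risk
f = os.fdopen(fd, "w")                    # f owns fd and will close it
del f                                     # CPython destroys f here, which closes fd
owned_still_open = fd_is_open(fd)         # False: the descriptor went away with the object

fd2 = os.dup(1)
g = os.fdopen(fd2, "w", closefd=False)    # assumption: Python 3; g only borrows fd2
del g
borrowed_still_open = fd_is_open(fd2)     # True: destroying g left fd2 alone
os.close(fd2)
```

With closefd=False the file object never closes the descriptor, so destroying it cannot take stdout down with it.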

I actually happen to have a context manager that I think does exactly what you are looking for (or almost, at least; in my case I happen to need a named tempfile). You can see that it reuses the original sys.stdout object to avoid the close problem.

import sys
import tempfile
import os

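class captured_stdout:
    def __init__(self):
        self.prevfd = None
        self.prev = None

    def __enter__(self):
        F = tempfile.NamedTemporaryFile()
        sys.stdout.flush()                         # write out pending output before swapping fds
        self.prevfd = os.dup(sys.stdout.fileno())  # keep a copy of the real stdout
        os.dup2(F.fileno(), sys.stdout.fileno())   # stdout's fd now refers to the temp file
        self.prev = sys.stdout                     # keep the original object alive
        sys.stdout = os.fdopen(self.prevfd, "w")   # python prints go to the real stdout
        return F

    def __exit__(self, exc_type, exc_value, traceback):
        sys.stdout.flush()                         # flush the temporary wrapper
        os.dup2(self.prevfd, self.prev.fileno())   # stdout's fd refers to the real stdout again
        sys.stdout = self.prev                     # the wrapper closes prevfd once it is collected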

## 
## Example usage
##

## here is a hack to print directly to stdout
import ctypes
libc=ctypes.LibraryLoader(ctypes.CDLL).LoadLibrary("libc.so.6")
def directfdprint(s):
    libc.write(1, s, len(s))


print("I'm printing from python before capture")
directfdprint("I'm printing from libc before capture\n")

with captured_stdout() as E:
    print("I'm printing from python in capture")
    directfdprint("I'm printing from libc in capture\n")

print("I'm printing from python after capture")
directfdprint("I'm printing from libc after capture\n")

print("Capture contains: " + repr(open(E.name).read()))
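
For readers on Python 3, the same idea might look roughly like the sketch below. It is an adaptation under stated assumptions, not the code above: the name captured_fd1 is made up for illustration, it hardcodes fd 1 instead of calling sys.stdout.fileno(), it uses an anonymous temp file, and it closes the temporary wrapper explicitly instead of relying on the garbage collector.

```python
import contextlib
import os
import sys
import tempfile

@contextlib.contextmanager
def captured_fd1():
    """Send fd 1 to a temp file while Python-level prints still reach the real stdout."""
    original = sys.stdout
    original.flush()                       # write out pending output before swapping fds
    saved_fd = os.dup(1)                   # copy of the real stdout
    tmp = tempfile.TemporaryFile(mode="w+")
    os.dup2(tmp.fileno(), 1)               # fd 1 now refers to the temp file
    wrapper = os.fdopen(saved_fd, "w")     # wraps the copy, never fd 1 itself
    sys.stdout = wrapper
    try:
        yield tmp
    finally:
        wrapper.flush()
        os.dup2(saved_fd, 1)               # fd 1 refers to the real stdout again
        sys.stdout = original
        wrapper.close()                    # closes saved_fd only
        tmp.seek(0)                        # leave tmp ready to be read by the caller

with captured_fd1() as tmp:
    print("from Python, goes to the real stdout")
    os.write(1, b"from fd 1, goes to the temp file\n")

captured = tmp.read()
tmp.close()
```

Because the only file object created with os.fdopen wraps the duplicate, destroying or closing it can never close fd 1, so the block can be entered repeatedly.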