Setting the default output encoding in Python 2 is a well-known idiom:
sys.stdout = codecs.getwriter("utf-8")(sys.stdout)
This wraps the sys.stdout object in a codec writer that encodes output in UTF-8.
However, this technique does not work in Python 3 because sys.stdout.write() expects a str, but the result of encoding is bytes, and an error occurs when codecs tries to write the encoded bytes to the original sys.stdout.
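The failure can be reproduced without touching the real sys.stdout. The following is a minimal sketch, assuming io.StringIO as a stand-in for a text-mode stream (like sys.stdout in Python 3, it accepts only str); the names text_stream, writer, and failed are illustrative only.

```python
import codecs
import io

# Assumption: io.StringIO stands in for a text-mode stream such as sys.stdout.
text_stream = io.StringIO()

# Same wrapping as the Python 2 idiom, applied to the stand-in stream.
writer = codecs.getwriter("utf-8")(text_stream)

try:
    # The codec writer encodes the str to bytes, then passes the bytes
    # to text_stream.write(), which only accepts str.
    writer.write("caf\u00e9")
    failed = False
except TypeError as exc:
    failed = True
    print("TypeError:", exc)
```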
What is the correct way to do this in Python 3?
Python 3.1 added io.TextIOBase.detach(), with a note in the documentation for sys.stdout:
The standard streams are in text mode by default. To write or read binary data to these, use the underlying binary buffer. For example, to write bytes to stdout, use sys.stdout.buffer.write(b'abc'). Using io.TextIOBase.detach() streams can be made binary by default. This function sets stdin and stdout to binary:

    def make_streams_binary():
        sys.stdin = sys.stdin.detach()
        sys.stdout = sys.stdout.detach()
Therefore, the corresponding idiom for Python 3.1 and later is:
sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach())
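To see the idiom work without replacing the real sys.stdout, here is a small sketch under stated assumptions: an io.TextIOWrapper around an io.BytesIO plays the role of sys.stdout, and it is deliberately given an ASCII encoding to show that the codec writer, not the original wrapper, decides the output encoding. The names raw, text_stream, utf8_stream, and result are hypothetical.

```python
import codecs
import io

# Assumption: this wrapper stands in for sys.stdout. Its own encoding is
# ASCII, which could not encode the character written below.
raw = io.BytesIO()
text_stream = io.TextIOWrapper(raw, encoding="ascii")

# detach() returns the underlying binary buffer, so the codec writer
# now hands its encoded bytes to a stream that accepts bytes.
utf8_stream = codecs.getwriter("utf-8")(text_stream.detach())

utf8_stream.write("caf\u00e9\n")
result = raw.getvalue()
print(result)
```

Note that after detach() the original text wrapper is left in an unusable state, so any code still holding a reference to the old stream object will get a ValueError when it tries to write to it.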