Why does Popen.communicate() return b'hi\n' instead of 'hi'?

imagineerThat · Mar 13, 2013 · Viewed 77.2k times

Can someone explain why the result I want, "hi", is preceded by the letter 'b' and followed by a newline?

I am using Python 3.3.

>>> import subprocess
>>> print(subprocess.Popen("echo hi", shell=True,
                           stdout=subprocess.PIPE).communicate()[0])
b'hi\n'

This extra 'b' does not appear if I run it with Python 2.7.

Answer

zigg · Mar 13, 2013

The b prefix indicates that what you have is a bytes object: a sequence of raw byte values rather than a str of Unicode characters. Subprocesses output bytes, not characters, so that's what communicate() returns.
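
To make the distinction concrete, here is a minimal sketch of how bytes and str differ in Python 3. The variable names are only illustrative.

```python
data = b"hi\n"   # bytes literal: a sequence of integers in the range 0-255
text = "hi\n"    # str literal: a sequence of Unicode characters

first_byte = data[0]     # indexing bytes yields an int (104 is the code for "h")
first_char = text[0]     # indexing str yields a one-character str
same = (data == text)    # bytes and str never compare equal in Python 3

print(type(data).__name__, type(text).__name__)
print(first_byte, first_char, same)
```

In Python 2.7 the plain string type was itself a byte string, which is why no b prefix shows up there.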

print() cannot know which encoding the bytes are in, so instead of text it shows you the repr of the bytes object. If you know the encoding of the bytes you received from the subprocess, you can use decode() to convert them into a printable str:

>>> print(b'hi\n'.decode('ascii'))
hi

Of course, this specific example only works if you actually are receiving ASCII from the subprocess. If it's not ASCII, you'll get an exception:

>>> print(b'\xff'.decode('ascii'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 0…
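
If the output may not be ASCII, one option is to decode with the codec the subprocess actually uses, or to pass an errors argument so that bad bytes do not raise. The following is only a sketch: the sample bytes are made up, and UTF-8 is an assumption about the encoding, not something the original example guarantees.

```python
# Made-up sample: the UTF-8 encoding of "caf\u00e9" followed by a newline.
raw = b"caf\xc3\xa9\n"

# Decoding with the matching codec recovers the text.
text = raw.decode("utf-8")

# Decoding with the wrong codec raises UnicodeDecodeError.
try:
    raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# errors="replace" substitutes U+FFFD for each undecodable byte instead of raising.
lossy = raw.decode("ascii", errors="replace")

print(repr(text), ascii_failed, repr(lossy))
```

Note that errors="replace" loses information, so it is better suited to display or logging than to data you need to process further.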

The newline is part of what echo hi outputs. echo's job is to print the arguments you pass it, followed by a newline. If you don't want the whitespace surrounding the process output, you can use strip() like so:

>>> b'hi\n'.strip()
b'hi'
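
Putting the pieces together, a minimal sketch of the whole round trip might look like this. It assumes the child's output is ASCII, and it uses a Python one-liner as a stand-in for echo hi so that it does not depend on a particular shell.

```python
import subprocess
import sys

# Assumption: this child process stands in for `echo hi`; it prints "hi" plus a newline.
proc = subprocess.Popen(
    [sys.executable, "-c", "print('hi')"],
    stdout=subprocess.PIPE,
)
raw, _ = proc.communicate()   # raw is a bytes object

# bytes -> str (assuming ASCII output), then drop the surrounding whitespace.
text = raw.decode("ascii").strip()
print(text)
```

Popen also accepts universal_newlines=True, in which case communicate() returns str decoded with the locale's preferred encoding, so the explicit decode() step is not needed.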