I am using subprocess.run() for some automated testing, mostly to automate running:
dummy.exe < file.txt > foo.txt
diff file.txt foo.txt
If you execute the above redirection in a shell, the two files are always identical. But whenever file.txt is too long, the Python code below does not return the correct result.
This is the Python code:
import subprocess
import sys

def main(argv):
    exe_path = r'dummy.exe'
    file_path = r'file.txt'
    with open(file_path, 'r') as test_file:
        stdin = test_file.read().strip()
    p = subprocess.run([exe_path], input=stdin, stdout=subprocess.PIPE, universal_newlines=True)
    out = p.stdout.strip()
    err = p.stderr
    if stdin == out:
        print('OK')
    else:
        print('failed: ' + out)

if __name__ == "__main__":
    main(sys.argv[1:])
Here is the C++ code in dummy.cc:
#include <iostream>

int main()
{
    int size, count, a, b;
    std::cin >> size;
    std::cin >> count;
    std::cout << size << " " << count << std::endl;
    for (int i = 0; i < count; ++i)
    {
        std::cin >> a >> b;
        std::cout << a << " " << b << std::endl;
    }
}
file.txt can be anything like this:
1 100000
0 417
0 842
0 919
...
The second integer on the first line is the number of lines that follow, so here file.txt will be 100,001 lines long.
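For anyone who wants to reproduce this without the linked file, here is a minimal sketch that generates input in the layout described above. The function names, the fixed first column of 0, and the value range are assumptions made for illustration; the real file.txt may contain different numbers.

```python
import random


def make_input(count, size=1, max_value=1000, seed=0):
    """Build text in the described layout: a 'size count' header line
    followed by `count` lines of two integers each."""
    rng = random.Random(seed)  # fixed seed so every run gives the same file
    lines = ["{} {}".format(size, count)]
    for _ in range(count):
        # Assumption: first column is always 0, second is a small integer,
        # mirroring the sample lines shown above.
        lines.append("0 {}".format(rng.randrange(max_value)))
    return "\n".join(lines) + "\n"


def write_input(path, count):
    """Write the generated text with LF line endings on every platform."""
    with open(path, "w", newline="\n") as f:
        f.write(make_input(count))
```

Calling write_input('file.txt', 100000) should give a file of 100,001 lines, the size at which the problem was observed.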
Question: Am I misusing subprocess.run()?
Edit
My exact Python code after taking the comments (newlines, rb) into account:
import subprocess
import sys
import os

def main(argv):
    base_dir = os.path.dirname(__file__)
    exe_path = os.path.join(base_dir, 'dummy.exe')
    file_path = os.path.join(base_dir, 'infile.txt')
    out_path = os.path.join(base_dir, 'outfile.txt')
    with open(file_path, 'rb') as test_file:
        stdin = test_file.read().strip()
    p = subprocess.run([exe_path], input=stdin, stdout=subprocess.PIPE)
    out = p.stdout.strip()
    if stdin == out:
        print('OK')
    else:
        with open(out_path, "wb") as text_file:
            text_file.write(out)

if __name__ == "__main__":
    main(sys.argv[1:])
Here is the first diff: [image]
Here is the input file: https://drive.google.com/open?id=0B--mU_EsNUGTR3VKaktvQVNtLTQ
To reproduce the shell command from Python:
subprocess.run("dummy.exe < file.txt > foo.txt", shell=True, check=True)
The same without the shell:
with open('file.txt', 'rb', 0) as input_file, \
     open('foo.txt', 'wb', 0) as output_file:
    subprocess.run(["dummy.exe"], stdin=input_file, stdout=output_file, check=True)
It works with arbitrarily large files.
You could use subprocess.check_call() in this case (available since Python 2) instead of subprocess.run(), which is available only in Python 3.5+.
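A minimal sketch of the check_call() variant, wrapped in a helper so the comparison step of the test can be automated too. The helper names run_redirected and files_equal are hypothetical, and ECHO_CMD is a stand-in for dummy.exe (a child Python process that copies stdin to stdout) so that the sketch runs on any machine; substitute ["dummy.exe"] for the real test.

```python
import subprocess
import sys

# Stand-in for dummy.exe (an assumption for illustration): a child Python
# process that copies its stdin to its stdout byte for byte.
ECHO_CMD = [
    sys.executable, "-c",
    "import shutil, sys; "
    "shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)",
]


def run_redirected(cmd, in_path, out_path):
    """Run the equivalent of `cmd < in_path > out_path` without a shell."""
    with open(in_path, "rb", 0) as input_file, \
         open(out_path, "wb", 0) as output_file:
        # The child inherits the file handles directly, so no pipe sits
        # between Python and the child process.
        subprocess.check_call(cmd, stdin=input_file, stdout=output_file)


def files_equal(path_a, path_b):
    """Byte-for-byte comparison, standing in for the `diff` step."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        return a.read() == b.read()
```

With these, the test reduces to run_redirected(["dummy.exe"], 'file.txt', 'foo.txt') followed by files_equal('file.txt', 'foo.txt'). Note that a byte-for-byte comparison is stricter than the strip()-based check in the question.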
Works very well, thanks. But then why was the original failing? Pipe buffer size, as in Kevin's answer?
It has nothing to do with OS pipe buffers. The warning from the subprocess docs that @Kevin J. Chase cites is unrelated to subprocess.run(). You should care about OS pipe buffers only if you use process = Popen() and manually read()/write() via multiple pipe streams (process.stdin/.stdout/.stderr).
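To illustrate the distinction, here is a minimal sketch of the Popen() case where pipe buffers do matter, and how communicate() sidesteps it. ECHO_CMD is again a hypothetical stand-in for dummy.exe, and echo_through_pipes is a name made up for this example.

```python
import subprocess
import sys

# Stand-in for dummy.exe (an assumption for illustration): a child Python
# process that copies its stdin to its stdout byte for byte.
ECHO_CMD = [
    sys.executable, "-c",
    "import shutil, sys; "
    "shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)",
]


def echo_through_pipes(data):
    """Send `data` to the child over a pipe and collect its output."""
    process = subprocess.Popen(
        ECHO_CMD, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    # communicate() feeds stdin and drains stdout concurrently, so neither
    # side can stall on a full pipe. Writing everything with
    # process.stdin.write(data) and only then calling process.stdout.read()
    # can deadlock once the data no longer fits in the OS pipe buffers.
    out, _ = process.communicate(data)
    return out
```

subprocess.run(..., input=data) goes through communicate() internally, which is why the pipe buffer warning does not apply to the code in the question.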
It turns out that the observed behavior is due to a Windows bug in the Universal CRT. Here's the same issue reproduced without Python: Why would redirection work where piping fails?
As said in the bug description, to work around it, either use ReadFile() directly instead of std::cin, or use g++ on Windows.
The bug affects only text pipes, i.e., code that uses the < and > redirections should be fine (stdin=input_file, stdout=output_file should still work, or it is some other bug).