I am having trouble with pexpect. I'm trying to grab output from tralics, which reads in LaTeX equations and emits the MathML representation, like this:
1 ~/ % tralics --interactivemath
This is tralics 2.14.5, a LaTeX to XML translator, running on tlocal
Copyright INRIA/MIAOU/APICS/MARELLE 2002-2012, Jos\'e Grimm
Licensed under the CeCILL Free Software Licensing Agreement
Starting translation of file texput.tex.
No configuration file.
> $x+y=z$
<formula type='inline'><math xmlns='http://www.w3.org/1998/Math/MathML'><mrow><mi>x</mi> <mo>+</mo><mi>y</mi><mo>=</mo><mi>z</mi></mrow></math></formula>
>
So I try to get the formula using pexpect:
import pexpect
c = pexpect.spawn('tralics --interactivemath')
c.expect('>')
c.sendline('$x+y=z$')
s = c.read_nonblocking(size=2000)
print s
The output has the formula, but with the original input at the beginning and some control chars at the end:
"x+y=z$\r\n<formula type='inline'><math xmlns='http://www.w3.org/1998/Math/MathML'><mrow><mi>x</mi><mo>+</mo><mi>y</mi><mo>=</mo><mi>z</mi></mrow></math></formula>\r\n\r> \x1b[K"
I can clean the output string, but I must be missing something basic. Is there a cleaner way to get the MathML?
From what I understand you are trying to get this from pexpect:
<formula type='inline'><math xmlns='http://www.w3.org/1998/Math/MathML'><mrow><mi>x</mi> <mo>+</mo><mi>y</mi><mo>=</mo><mi>z</mi></mrow></math></formula>
You can use a regexp instead of ">" for the matching in order to get the expected result. This is the easiest example:
c.expect("<formula.*formula>")
After that, the matched text is available through the match attribute of the spawn object, which is a regular-expression match object (the same text is also stored in c.after):
print c.match.group(0)
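Putting the pieces together, a minimal sketch of the whole round trip might look like the following. It assumes Python 3 with pexpect 4.x (for the encoding argument) and that tralics is on the PATH. The names latex_to_mathml and FORMULA_RE, and the command and timeout parameters, are made up for illustration and are not part of pexpect or tralics.

```python
import re

# Same pattern as above, precompiled. re.DOTALL lets "." cross line breaks,
# which is also how pexpect compiles plain string patterns.
FORMULA_RE = re.compile(r"<formula.*formula>", re.DOTALL)


def latex_to_mathml(latex, command="tralics --interactivemath", timeout=10):
    """Send one LaTeX snippet to tralics and return the matched <formula> text.

    Hypothetical helper: assumes tralics is installed and shows a ">" prompt.
    """
    import pexpect  # imported here so the module loads where pexpect is absent

    # encoding="utf-8" makes pexpect return str, matching the str pattern.
    child = pexpect.spawn(command, encoding="utf-8", timeout=timeout)
    try:
        child.expect(">")            # wait for the first prompt
        child.sendline(latex)
        child.expect(FORMULA_RE)     # skips the echoed input before the formula
        return child.match.group(0)  # same text as child.after
    finally:
        child.close()
```

Calling latex_to_mathml('$x+y=z$') should then return only the <formula ...>...</formula> text. The echoed input ends up in child.before, and the trailing prompt and control characters are simply never read.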
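You might also try different regexps: the one I posted is greedy, so if the buffer ever holds more than one formula it will match from the first <formula through the last formula>, and on large outputs the extra scanning may cost some time. A non-greedy pattern such as "<formula.*?</formula>" stops at the first closing tag instead.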
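To see the difference between the greedy pattern and a non-greedy one without running tralics, here is a small sketch using only the re module. The buffer contents are invented for illustration: they mimic what might accumulate if two formulas had been printed before anything was read.

```python
import re

# Hypothetical buffer holding two formulas separated by a prompt.
buffer = (
    "<formula type='inline'><math><mi>x</mi></math></formula>\r\n"
    "> <formula type='inline'><math><mi>y</mi></math></formula>\r\n> "
)

greedy = re.compile(r"<formula.*formula>", re.DOTALL)
lazy = re.compile(r"<formula.*?</formula>", re.DOTALL)

# Greedy: one match running from the first opening tag to the last closing tag.
greedy_match = greedy.search(buffer).group(0)

# Non-greedy: each formula is matched on its own.
lazy_matches = lazy.findall(buffer)

print(greedy_match.count("</formula>"))
print(len(lazy_matches))
```

With one formula per prompt both patterns return the same text, so the non-greedy form mainly matters as a safeguard when output piles up in the buffer.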