How to build a parse tree of a mathematical expression?

bodacydo picture bodacydo · Jul 2, 2014 · Viewed 11.6k times · Source

I'm learning how to write tokenizers, parsers and as an exercise I'm writing a calculator in JavaScript.

I'm using a prase tree approach (I hope I got this term right) to build my calculator. I'm building a tree of tokens based on operator precedence.

For example, given an expression a*b+c*(d*(g-f)) the correct tree would be:

         +
        / \
       *   \
      / \   \
     a   b   \
              *
             / \
            c   *
               / \
              d   -
                 / \
                g   f

Once I've the tree, I can just traverse it down and apply the operations at each root node on the left and right nodes recursively to find the value of the expression.

However, the biggest problem is actually building this tree. I just can't figure out how to do it correctly. I can't just split on operators +, -, / and * and create the tree from the left and right parts because of precedence.

What I've done so far is tokenize the expression. So given a*b+c*(d*(g-f)), I end up with an array of tokens:

[a, Operator*, b, Operator+, c, Operator*, OpenParen, d, Operator*, OpenParen, g, Operator-, f, CloseParen, CloseParen]

However I can't figure out the next step about how to go from this array of tokens to a tree that I can traverse and figure out the value. Can anyone help me with ideas about how to do that?

Answer

scarface382 picture scarface382 · Mar 11, 2017

Right mate, I dont know how to make this look pretty

I wrote a similar program in C but my tree is upside down, meaning new operators become the root.

A calculator parse tree in C code, but read the readme

ex: input 2 + 3 - 4

Start with empty node

{}

Rule 1: when you read a number, append a child either left or right of the current node, whichever one is empty

      {}
     /
   {2}

Rule 2: then you read an operator, you have to climb starting at the current node {2} to an empty node, when you find one, change its value to +, if there are no empty nodes, you must create one then make it the root of the tree

      {+}
     /
   {2}

you encounter another number, go to rule 1, we are currently at {+} find an side that is empty (this time right)

      {+}
     /   \
   {2}   {3}

Now we have new operand '-' but because the parent of {3} is full, you must create new node and make it the root of everything

        {-}
        /
      {+}
     /   \
   {2}   {3}

oh look at that, another number, because we are currently pointing at the root, lets find a child of {-} that is empty, (hint the right side)

         {-}
        /   \
      {+}   {4}
     /   \
   {2}   {3}

Once a tree is built, take a look at the function getresult() to calculate everything

You are probably wondering how Parenthesization works. I did it this way:

I create brand new tree every time I encounter a '(' and build the rest of the input into that tree. If I read another '(', I create another one and continue build with the new tree.

Once input is read, I attach all the trees's roots to one another to make one final tree. Check out the code and readme, I have to draw to explain everything.

Hope this helps future readers as well