What's the difference between parse trees and abstract syntax trees?

Shashi Bhushan picture Shashi Bhushan · May 11, 2011 · Viewed 51.9k times · Source

I found the two terms in a compiler design book, and I'd like to know what each stands for, and how they are different.

I searched on the internet and found that parse trees are also called concrete syntax trees (CSTs).

Answer

Guy Coder picture Guy Coder · Apr 16, 2012

This is based on the Expression Evaluator grammar by Terrence Parr.

The grammar for this example:

grammar Expr002;

options 
{
    output=AST;
    ASTLabelType=CommonTree; // type of $stat.tree ref etc...
}

prog    :   ( stat )+ ;

stat    :   expr NEWLINE        -> expr
        |   ID '=' expr NEWLINE -> ^('=' ID expr)
        |   NEWLINE             ->
        ;

expr    :   multExpr (( '+'^ | '-'^ ) multExpr)*
        ; 

multExpr
        :   atom ('*'^ atom)*
        ; 

atom    :   INT 
        |   ID
        |   '('! expr ')'!
        ;

ID      : ('a'..'z' | 'A'..'Z' )+ ;
INT     : '0'..'9'+ ;
NEWLINE : '\r'? '\n' ;
WS      : ( ' ' | '\t' )+ { skip(); } ;

Input

x=1
y=2
3*(x+y)

Parse Tree

The parse tree is a concrete representation of the input. The parse tree retains all of the information of the input. The empty boxes represent whitespace, i.e. end of line.

Parse Tree

AST

The AST is an abstract representation of the input. Notice that parens are not present in the AST because the associations are derivable from the tree structure.

AST

EDIT

For a more through explanation see Compilers and Compiler Generators by P.D. Terry pg. 23. Also see the authors home page for more items such as source code.