I am a complete ANTLR4 newbie, so please forgive my ignorance. I ran into this presentation where a very simple arithmetic expression grammar is defined. It looks like:
grammar Expressions;
start : expr ;
expr : left=expr op=('*'|'/') right=expr #opExpr
| left=expr op=('+'|'-') right=expr #opExpr
| atom=INT #atomExpr
;
INT : ('0'..'9')+ ;
WS : [ \t\r\n]+ -> skip ;
Which is great because it will generate a very simple binary tree that can be traversed using the visitor pattern as explained in the slides, e.g., here's the function that visits the expr
:
public Integer visitOpExpr(OpExprContext ctx) {
int left = visit(ctx.left);
int right = visit(ctx.right);
String op = ctx.op.getText();
switch (op.charAt(0)) {
case '*': return left * right;
case '/': return left / right;
case '+': return left + right;
case '-': return left - right;
default: throw new IllegalArgumentException("Unkown opeator " + op);
}
}
The next thing I would like to add is support for parentheses. So I modified the expr
as follows:
expr : '(' expr ')' #opExpr
| left=expr op=('*'|'/') right=expr #opExpr
| left=expr op=('+'|'-') right=expr #opExpr
| atom=INT #atomExpr
;
Unfortunately, the code above fails because when encountering parentheses the three attributes op
,left
and right
are null (fails with NPE).
I think I could work around that by defining a new attribute, e.g., parenthesized='(' expr ')'
, and then deal with that in the visitor code. However, it seems overkill to me to have a whole extra node type to represent an expression in parentheses. A simpler but uglier solution is to add the following line of code at the beginning of the visitOpExpr
method:
if (ctx.op == null) return visit(ctx.getChild(1)); // 0 and 2 are the parentheses!
I don't like the above at all because it's very fragile and highly dependent on the grammar structure.
I am wondering if there is a way to tell ANTLR to just "eat" the parentheses and treat the expression like a child. Is there? Is there a better way to do this?
Note: My end goal is to extend the example to include boolean expressions that can themselves contain arithmetic expressions, e.g., (2+4*3)/10 >= 11
, that is, a relation (<,>,==,~=,etc.) between arithmetic expressions can define an atomic boolean expression. This is straight forward and I already have the grammar sketched out but I have the same problem with parenthesis, i.e., I need to be able to write stuff like (I will also add support for variables):
((2+4*x)/10 >= 11) | ( x>1 & x<3 )
EDIT: Fixed the precedence of the parenthesized expression, parenthesis always have higher precedence.
Sure, just label it differently. After all, the alternative '(' expr ')'
isn't a #opExpr
:
expr : left=expr op=('*'|'/') right=expr #opExpr
| left=expr op=('+'|'-') right=expr #opExpr
| '(' expr ')' #parenExpr
| atom=INT #atomExpr
;
And in your visitor, you'd do something like this:
public class EvalVisitor extends ExpressionsBaseVisitor<Integer> {
@Override
public Integer visitOpExpr(@NotNull ExpressionsParser.OpExprContext ctx) {
int left = visit(ctx.left);
int right = visit(ctx.right);
String op = ctx.op.getText();
switch (op.charAt(0)) {
case '*': return left * right;
case '/': return left / right;
case '+': return left + right;
case '-': return left - right;
default: throw new IllegalArgumentException("Unknown operator " + op);
}
}
@Override
public Integer visitStart(@NotNull ExpressionsParser.StartContext ctx) {
return this.visit(ctx.expr());
}
@Override
public Integer visitAtomExpr(@NotNull ExpressionsParser.AtomExprContext ctx) {
return Integer.valueOf(ctx.getText());
}
@Override
public Integer visitParenExpr(@NotNull ExpressionsParser.ParenExprContext ctx) {
return this.visit(ctx.expr());
}
public static void main(String[] args) {
String expression = "2 * (3 + 4)";
ExpressionsLexer lexer = new ExpressionsLexer(new ANTLRInputStream(expression));
ExpressionsParser parser = new ExpressionsParser(new CommonTokenStream(lexer));
ParseTree tree = parser.start();
Integer answer = new EvalVisitor().visit(tree);
System.out.printf("%s = %s\n", expression, answer);
}
}
If you run the class above, you'd see the following output:
2 * (3 + 4) = 14