The TestDriver in ANTLRWorks2 seems kind of finicky about when it'll accept a grammar without an explicit EOF and when it will not. The Hello grammar in the ANTLR4 Getting Started Guide doesn't use EOF anywhere, so I inferred that it's better to avoid explicit EOF if possible.
What is the best practice for using EOF? When do you actually need it?
You should include an explicit EOF at the end of your entry rule any time you are trying to parse an entire input file. If you do not include the EOF, it means you are not trying to parse the entire input, and it's acceptable to parse only a portion of the input if it means avoiding a syntax error.
For example, consider the following rule:
file : item*;
This rule means "Parse as many item elements as possible, and then stop." In other words, this rule will never attempt to recover from a syntax error because it will always assume that the syntax error is part of some syntactic construct that's beyond the scope of the file rule. Syntax errors will not even be reported, because the parser will simply stop.
Now consider the following rule instead:
file : item* EOF;
It means "A file consists exactly of a sequence of zero-or-more item elements." If a syntax error is reached while parsing an item element, this rule will attempt to recover from (and report) the syntax error and continue because the EOF is required and has not yet been reached.
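To make the difference concrete, here is a minimal sketch of a complete grammar built around the second rule. The grammar name Items, the item syntax, and the lexer rules are all made up for illustration; only the file rule comes from the discussion above.

```antlr
grammar Items;

// Entry rule: the trailing EOF requires the parser to consume the whole input,
// so a malformed item is reported instead of silently ending the parse.
file : item* EOF ;

// Hypothetical item syntax, for illustration only: "name = 42;"
item : ID '=' INT ';' ;

ID  : [a-zA-Z_] [a-zA-Z_0-9]* ;
INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
```

Given an input such as `a = 1; 2 = b;`, the second statement cannot start an item. With the EOF in place, the parser should report a syntax error at `2`. If you drop the EOF from the file rule, then per the explanation above the parser would be expected to match `a = 1;`, stop, and report nothing.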
For rules where you are only trying to parse a portion of the input, ANTLR 4 often works, but not always. The following issue describes a technical problem where ANTLR 4 does not always make the correct decision if the EOF is omitted:
https://github.com/antlr/antlr4/issues/118
Unfortunately, the performance impact of fixing this is substantial, so until the issue is resolved there will be edge cases that do not behave as you expect.