Bison: Optional tokens in a single rule

Simone Margaritelli · Apr 19, 2010

I'm using GNU Bison 2.4.2 to write a grammar for a new language I'm working on, and I have a question. Say I specify a rule:

statement : T_CLASS T_IDENT  '{' T_CLASS_MEMBERS '}' {
           // create a node for the statement ...
}

Suppose I also have a variation on the rule, for instance:

statement : T_CLASS T_IDENT T_EXTENDS T_IDENT_LIST  '{' T_CLASS_MEMBERS '}' {
           // create a node for the statement ...
}

where (from the flex scanner rules):

"class"                     return T_CLASS;
"extends"                   return T_EXTENDS;
[a-zA-Z_][a-zA-Z0-9_]*      return T_IDENT;

(and T_IDENT_LIST is a rule for comma-separated identifiers).

Is there any way to express both forms in a single rule, somehow marking "T_EXTENDS T_IDENT_LIST" as optional? I've already tried:

 T_CLASS T_IDENT (T_EXTENDS T_IDENT_LIST)? '{' T_CLASS_MEMBERS '}' {
     // create a node for the statement ...
 } 

But Bison gave me an error.

Thanks

Answer

Jerry Coffin · Apr 19, 2010

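To make a long story short, no. Bison's rule syntax is plain BNF: it has no "?" operator and no parenthesized groups, so an optional part has to be factored out into its own nonterminal with an empty alternative. By default Bison generates LALR(1) parsers, which use one symbol of lookahead, and the rewrite below fits within that. What you need is something like this: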

statement: T_CLASS T_IDENT extension_list '{' ...

extension_list: /* empty */
              | T_EXTENDS T_IDENT_LIST
              ;

There are other parser generators that work with more general grammars though. If memory serves, some of them support optional elements relatively directly like you're asking for.
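
To show how the action for the single statement rule can tell whether the optional part was present, here is a minimal sketch of the extension_list approach as a complete grammar file. Several things in it are assumptions made for illustration and are not from the question: the file name optional.y, the lowercase ident_list nonterminal (standing in for T_IDENT_LIST), the empty class body, the integer flag used as the semantic value, and the hand-written yylex stub that replaces the flex scanner.

```yacc
/* optional.y: hypothetical, minimal demonstration.
   Build (assumed commands): bison optional.y && cc optional.tab.c -o optional */
%{
#include <stdio.h>

int yylex(void);
void yyerror(const char *msg) { fprintf(stderr, "error: %s\n", msg); }
%}

%token T_CLASS T_IDENT T_EXTENDS

%%

/* One rule covers both forms; $3 says whether "extends ..." was seen. */
statement
    : T_CLASS T_IDENT extension_list '{' '}'
        { printf("class %s a base list\n", $3 ? "with" : "without"); }
    ;

/* Semantic value (default type int): 1 if the clause is present, else 0. */
extension_list
    : /* empty */             { $$ = 0; }
    | T_EXTENDS ident_list    { $$ = 1; }
    ;

/* Comma-separated identifiers, written left-recursively. */
ident_list
    : T_IDENT
    | ident_list ',' T_IDENT
    ;

%%

/* Stub scanner standing in for the flex rules.
   It replays the tokens of: class A extends B, C { } */
int yylex(void)
{
    static const int tokens[] = {
        T_CLASS, T_IDENT, T_EXTENDS, T_IDENT, ',', T_IDENT, '{', '}', 0
    };
    static int pos = 0;
    return tokens[pos] ? tokens[pos++] : 0;   /* 0 signals end of input */
}

int main(void) { return yyparse(); }
```

In a real grammar the empty alternative would more likely set $$ to NULL and the other alternative would pass along the list built by the identifier-list rule, so the statement action can attach whichever it receives to the class node.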