A notation for empty right-hand sides of rules

akim · Feb 1, 2013

When writing a ("theoretical") grammar with a rule that has an empty right-hand side, one always uses a symbol such as ε (or 1) to make this emptiness explicit:

A → ε | a A

Such a grammar in Yacc and others would then look like

a: | 'a' a

or "worse"

a:       { $$ = new_list(); }
 | a 'a' { $$ = $1; $$->append($2); }
 ;

The fact that in "real world grammars" (Yacc, Bison, etc.) an empty right-hand side is not explicitly marked as empty troubles me: it is easy to miss that an rhs is empty, or worse, to forget the | and thereby turn the first action into a mid-rule action:

a:       { $$ = new_list(); }
   a 'a' { $$ = $1; $$->append($2); }
 ;
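
To make the danger concrete, here is a rough sketch of how such a rule is read once the | is gone. The first action is no longer the action of an empty alternative; it becomes a mid-rule action, which behaves like a hidden nonterminal with an empty rhs. The name midrule_1 is hypothetical and only stands in for the symbol the generator would create internally.

```yacc
/* Sketch only: `midrule_1` is a made-up name for the hidden symbol
   that a mid-rule action turns into. */
a: midrule_1 a 'a' { /* final action: $1 is now the value of midrule_1,
                        the nested list is $2 and 'a' is $3 */ }
 ;

midrule_1: /* empty */ { $$ = new_list(); }
 ;
```

Read this way, a has a single alternative that refers to itself and no empty alternative at all, which is not what was intended.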

1) I don't know of any tool that provides a means to make empty rhs explicit. Are there any?

Future versions of Bison might support a dedicated symbol, with errors when used in a non-empty rhs, and warnings when an implicitly empty rhs is left.

2) Do people consider this useful?

3) What would be the notation you'd suggest?

Currently, the candidate is $empty:

a: $empty { $$ = new_list(); }
 | a 'a'  { $$ = $1; $$->append($2); }
 ;

EDIT

The chosen syntax is %empty:

a: %empty { $$ = new_list(); }
 | a 'a'  { $$ = $1; $$->append($2); }
 ;
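
As a sketch of the three situations the directive is meant to distinguish, assuming Bison 3.0 or later and the -Wempty-rule warning category described in the Bison manual: the nonterminals b and c are hypothetical, added only for illustration, and the comments describe the expected diagnostics rather than captured output.

```yacc
/* Explicitly empty rhs: accepted. */
a: %empty { $$ = new_list(); }
 | a 'a'  { $$ = $1; $$->append($2); }
 ;

/* %empty on an rhs that is not empty: expected to be rejected as an error. */
b: %empty 'b'
 ;

/* Implicitly empty rhs: expected to be reported under -Wempty-rule. */
c:
 | c 'c'
 ;
```

According to the manual, using %empty anywhere in a grammar turns the -Wempty-rule warning on unless -Wno-empty-rule is given, so explicitly marked and unmarked empty rules are not silently mixed.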

Indeed $empty looks like a pseudo-symbol, such as $accept that Bison generates for the initial rule, or the $@n pseudo-symbols for mid-rule actions, or $end for, well, end-of-file. But it's definitely not a symbol, it is precisely the absence of symbols.

On the other hand % clearly denotes a directive (some kind of attribute/metadata), like %prec.

So it's a minor difference of syntax, but it's more consistent with the overall syntax. Credit goes to Joel E. Denny.

Answer

Chris Dodd · Feb 1, 2013

I usually just use a comment:

a: /*epsilon*/ { $$ = new_list(); }
 | a 'a'  { $$ = $1; $$->append($2); }
 ;

Works fine with no changes and makes the intent clear....

IMO, this comes under the heading "If it ain't broke, don't fix it"