Lisp grammar in yacc

Paul Nathan picture Paul Nathan · Feb 5, 2009 · Viewed 15.6k times · Source

I am trying to build a Lisp grammar. Easy, right? Apparently not.

I present these inputs and receive errors...

( 1 1)
23 23 23 
ui ui

This is the grammar...

%%
sexpr: atom                 {printf("matched sexpr\n");}
    | list
    ;
list: '(' members ')'       {printf("matched list\n");}
    | '('')'                {printf("matched empty list\n");}
    ;
members: sexpr              {printf("members 1\n");}
    | sexpr members         {printf("members 2\n");}
    ;
atom: ID                    {printf("ID\n");}
    | NUM                   {printf("NUM\n");}
    | STR                   {printf("STR\n");}
    ;
%%

As near as I can tell, I need a single non-terminal defined as a program, upon which the whole parse tree can hang. But I tried it and it didn't seem to work.

edit - this was my "top terminal" approach:

program: slist;

slist: slist sexpr | sexpr;

But it allows problems such as:

( 1 1 

Edit2: The FLEX code is...

%{
    #include <stdio.h>
    #include "a.yacc.tab.h"
    int linenumber;
    extern int yylval;
%}
%%
\n                         { linenumber++; }
[0-9]+                     { yylval = atoi(yytext); return NUM; }
\"[^\"\n]*\"               { return STR; }
[a-zA-Z][a-zA-Z0-9]*       { return ID; }
.
%%

An example of the over-matching...

(1 1 1)
NUM
matched sexpr
NUM
matched sexpr
NUM
matched sexpr
(1 1
NUM
matched sexpr
NUM
matched sexpr

What's the error here?

edit: The error was in the lexer.

Answer

dmitry_vk picture dmitry_vk · Feb 5, 2009

Lisp grammar can not be represented as context-free grammar, and yacc can not parse all lisp code. It is because of lisp features such as read-evaluation and programmable reader. So, in order just to read an arbitrary lisp code, you need to have a full lisp running. This is not some obscure, non-used feature, but it is actually used. E.g., CL-INTERPOL, CL-SQL.

If the goal is to parse a subset of lisp, then the program text is a sequence of sexprs.