Setting up Cup/JLex parsing properly

user2192677 · Jun 9, 2013 · Viewed 7.3k times

I have a very basic lexer here:

import java_cup.runtime.*;
import java.io.IOException;

%%

%class AnalyzerLex

%function next_token
%type java_cup.runtime.Symbol

%unicode
//%line
//%column

// %public
%final
// %abstract

%cupsym sym
%cup
%cupdebug

%eofval{
  return sym(sym.EOF);
%eofval}

%init{
    // TODO: code that goes to constructor
%init}

%{
    private Symbol sym(int type)
    {
        return sym(type, yytext());
    }

    private Symbol sym(int type, Object value)
    {
        return new Symbol(type, yyline, yycolumn, value);
    }

    private void error()
    throws IOException
    {
        throw new IOException("Illegal text at line = "+yyline+", column = "+yycolumn+", text = '"+yytext()+"'");
    }
%}

ANY = .

%%

{ANY}       { return sym(sym.ANY); }
"\n" { }

And this is my very basic parser:

import java_cup.runtime.*;

parser code
{:

    public void syntax_error(Symbol cur_token) {
        System.err.println("syntax_error " + cur_token );
    }

:}

action code
{:
:}

terminal        ANY;

non terminal    grammar;

grammar         ::= ANY : a
                {:
                //System.out.println(a);
                :}
                ;

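I am trying to parse a sample file with a method like this:

AnalyzerLex scanner = null;
ParserCup pc = null;
try {
    scanner = new AnalyzerLex(new java.io.FileReader(argv[i]));
    pc = new ParserCup(scanner);
    while (!scanner.zzAtEOF) {
        // CUP's debugging entry point is debug_parse(), as the stack trace below shows
        pc.debug_parse();
    }
} catch (Exception e) {
    System.out.println("Unexpected exception:");
    e.printStackTrace();
}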

But the above code throws an error:

    #2
Unexpected exception:
# Initializing parser
# Current Symbol is #2
# Shift under term #2 to state #2
# Current token is #2
syntax_error #2
# Attempting error recovery
# Finding recovery state on stack
# Pop stack by one, state was # 2
# Pop stack by one, state was # 0
# No recovery state found on stack
# Error recovery fails
Couldn't repair and continue parse at character 0 of input
java.lang.Exception: Can't recover from previous error(s)
    at java_cup.runtime.lr_parser.report_fatal_error(lr_parser.java:375)
    at java_cup.runtime.lr_parser.unrecovered_syntax_error(lr_parser.java:424)
    at java_cup.runtime.lr_parser.debug_parse(lr_parser.java:816)
    at AnalyzerLex.main(AnalyzerLex.java:622)

I think I am not setting up the lexer/parser properly.

Answer

Miguel Jiménez · Sep 26, 2013

I am not an expert, but I recommend the following:

  1. You may have to specify which non-terminal to start with (by default CUP uses the left-hand side of the first production), for example:

    start with compilation_unit;
    
  2. You can enhance your syntax error method by adding the line and column, so that it is clearer where the error is.

    public void syntax_error(Symbol s){
        System.out.println("compiler has detected a syntax error at line " + s.left 
            + " column " + s.right);
    }
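
The second suggestion can be tried out in plain Java without CUP on the classpath. This is only a minimal sketch under a few assumptions: `TokenInfo` is a hypothetical stand-in for `java_cup.runtime.Symbol` that mirrors just its `left` and `right` int fields, and `describe` is an invented helper name. It relies on the lexer in the question passing `yyline` as `left` and `yycolumn` as `right`. JFlex counts both from 0, so the sketch adds 1 before printing. As far as I know, JFlex only updates those counters when `%line` and `%column` are enabled, and both are commented out in the lexer above.

```java
public class SyntaxErrorDemo {
    // Hypothetical stand-in for java_cup.runtime.Symbol: it keeps only the
    // two position fields that the syntax_error method above reads.
    static final class TokenInfo {
        final int left;   // line as stored by the lexer (0-based yyline)
        final int right;  // column as stored by the lexer (0-based yycolumn)

        TokenInfo(int left, int right) {
            this.left = left;
            this.right = right;
        }
    }

    // Builds the message; adds 1 so positions match what editors display.
    static String describe(TokenInfo s) {
        return "syntax error at line " + (s.left + 1)
             + ", column " + (s.right + 1);
    }

    public static void main(String[] args) {
        // Second character of the first line: yyline = 0, yycolumn = 1.
        System.err.println(describe(new TokenInfo(0, 1)));
    }
}
```

In the real parser the same string concatenation would go inside `syntax_error(Symbol s)` in the `parser code` section, reading `s.left` and `s.right` directly.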