Start states in Lex / Flex

Dan · Jul 15, 2009

I'm using Flex and Bison to generate a parser, but I'm having problems with the start states in my scanner.

I'm using an exclusive start condition to deal with comments, but these rules don't seem to swallow quoted tokens inside a comment:

%x COMMENT

//                    { BEGIN(COMMENT); }
<COMMENT>[^\n]        ;
<COMMENT>\n           { BEGIN(INITIAL); }

"=="                  { return EQUALEQUAL; }

.                     ;

In this simple example the line:

// a == b

isn't matched entirely as a comment, unless I include this rule:

<COMMENT>"=="             ;

How do I get round this without having to add all these tokens into my exclusive rules?

Answer

Aiden Bell · Jul 15, 2009

Matching C-style comments in Lex/Flex is well documented, both in the Flex documentation and in various variations around the Internet.

Here is a variation on that found in the Flex documentation:

%x IN_COMMENT

<INITIAL>{
"//"        BEGIN(IN_COMMENT);
}
<IN_COMMENT>{
\n          BEGIN(INITIAL);
[^\n]+      /* eat the comment text; this also consumes any further '/' */
}
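
For comparison with the scanner in the question, here is a minimal, self-contained sketch of the same idea. One likely cause of the original problem (an assumption, since the question doesn't show Flex's output) is the unquoted `//`: in a Flex pattern `/` is the trailing-context operator, so it has to be quoted or escaped to match a literal slash. Once `"//"` really switches to the exclusive state, rules without a start-condition prefix, such as `"=="`, are inactive there and don't need to be repeated. The `enum` value, `%option noyywrap` and `main` below are placeholders added only so the file builds without Bison.

```lex
%option noyywrap
%x COMMENT

%{
#include <stdio.h>
/* EQUALEQUAL would normally come from the Bison-generated header;
   this placeholder value only keeps the sketch self-contained. */
enum { EQUALEQUAL = 258 };
%}

%%

"//"            { BEGIN(COMMENT); }   /* quoted: a bare / means trailing context */
<COMMENT>[^\n]+ ;                     /* discard the rest of the comment line */
<COMMENT>\n     { BEGIN(INITIAL); }   /* end of line ends the comment */

"=="            { return EQUALEQUAL; }
.|\n            ;                     /* ignore everything else */

%%

/* Hypothetical driver: print each token code returned by yylex(). */
int main(void)
{
    int tok;
    while ((tok = yylex()) != 0)
        printf("token %d\n", tok);
    return 0;
}
```

Saved as, say, scanner.l (a made-up file name), this can be built with `flex scanner.l` followed by `cc lex.yy.c -o scanner`. Fed the two lines `// a == b` and `a == b`, it should report a single token, for the `==` on the second line.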