Is there a working C++ grammar file for ANTLR?

c14ppy · Feb 24, 2010 · Viewed 7k times

Are there any existing C++ grammar files for ANTLR?

I'm looking to lex, not parse, some C++ source code files.

I've looked on the ANTLR grammar page, and there is one listed that was created by Sun Microsystems here.

However, it seems to be a generated parser.

Can anyone point me to a C++ ANTLR lexer or grammar file?

Answer

Ira Baxter · Feb 24, 2010

C++ parsers are tough to build.

I can't speak from experience about using ANTLR's C++ grammars. Here I discuss what I learned by reading the notes attached to the one I did see at the ANTLR site; in essence, the author produced an incomplete grammar, and that was for just C++98. It has been a while since I looked; there may be others.

Our DMS Software Reengineering Toolkit has a robust C++ front end.

The lexer handles all the cruft for the ANSI, GCC3, and MS Visual Studio 2008 dialects, including large-precision floating-point numbers, etc.
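To give a feel for the kind of detail that even "just lexing" C++ involves, here is a minimal sketch of a regex-based tokenizer in Python. It is not the DMS lexer and not an ANTLR grammar; the token names, the `TOKEN_SPEC` table, and the `lex` function are all hypothetical, and the patterns cover only a small subset of C++ (no preprocessor handling, raw strings, digit separators, trigraphs, or line continuations).

```python
import re

# Hypothetical, deliberately incomplete token table for a small C++ subset.
# Order matters: earlier alternatives win, so FLOAT must come before INT.
TOKEN_SPEC = [
    ("COMMENT", r"//[^\n]*|/\*.*?\*/"),
    ("FLOAT",   r"(?:\d+\.\d*|\.\d+)(?:[eE][+-]?\d+)?[fFlL]?|\d+[eE][+-]?\d+[fFlL]?"),
    ("INT",     r"0[xX][0-9a-fA-F]+[uUlL]*|\d+[uUlL]*"),
    ("STRING",  r'"(?:\\.|[^"\\\n])*"'),
    ("CHAR",    r"'(?:\\.|[^'\\\n])+'"),
    ("IDENT",   r"[A-Za-z_]\w*"),
    ("PUNCT",   r"<<=|>>=|->\*|\.\.\.|::|->|\+\+|--|<<|>>|<=|>=|==|!=|&&|\|\|"
                r"|[-+*/%&|^]=|[-+*/%&|^~!=<>?:;,.()\[\]{}#]"),
    ("SPACE",   r"\s+"),
]

# One master pattern with a named group per token kind.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,  # lets /* ... */ comments span several lines
)

def lex(source):
    """Return a list of (kind, text) pairs, skipping whitespace and comments."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup not in ("SPACE", "COMMENT"):
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

if __name__ == "__main__":
    for kind, text in lex("double x = 1.5e-3L; // note\nint y = 0x1F;"):
        print(kind, text)
```

Even this toy version has to order its rules carefully (floating-point literals before integers, longer operators before shorter ones), and compiler-specific extensions multiply those cases.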

[EDIT: 12/2011. Now handles C++11 and OpenMP directives]

[EDIT: 3/2015: Now handles C++14 in both GCC and MS variants. See some parse trees here on SO]

Having "just" a parser is actually not very useful. Above and beyond "just parsing", our front end will build ASTs, build accurate symbol tables (for C++, this is extremely hard to do), perform function-local flow analysis, and allow you to carry out program transformations, etc. See Life After Parsing.

[EDIT: 5/2019: Now handles C++17 in ANSI, GCC and MS variants. Does complete name and type resolution across compilation units. Used to automate large scale God-class refactoring/splitting across systems of 3000 compilation units.]