Semantic analysis in compilers

Overflowh picture Overflowh · Jan 3, 2012 · Viewed 8.1k times · Source

How is the semantic analysis done by a compiler (generally)?

I had to answer to this question during my last exam, it wasn't enough for the professor.

I included BNF (with an example) and syntactic cards in my answer, to which he asked me: "What happens when the compiler finds a statement like int i;?"

Answer

Ira Baxter picture Ira Baxter · Jan 3, 2012

Time to read Aho&Ullman/Dragon book carefully.

Semantic analysis is the activity of a compiler to determine what the types of various values are, how those types interact in expressions, and whether those interactions are semantically reasonable. For instance, you can't reasonably multiply a string by class name, although no editor will stop you from writing

 "abc" * MyClass

To do this, the compiler must first identify declarations and scopes, and typically records the result of this step in a set of symbol tables. This tells it what specific identifiers means in specific contexts. It must also determine the types of various literal constants; "abc" is a different type than 12.2e-5.

Then it must visit all locations where identifiers and literals are used, and verify that the use of the identifier/literal, and the results computed, are compatible with the language definition (as in the above example).

As to how this is done: typically the source code is parsed, some representation of the program is constructed (syntax trees are very popular), and that representation is walked ("visited") element by element to collect/validate the semantic information. The symbol table is usually just a set of hash tables associated with the syntax tree representing a scope, hashing from identifiers to structures containing type declarations.