LR(1) Item DFA - Computing Lookaheads

mrjasmin picture mrjasmin · Dec 31, 2012 · Viewed 16.5k times · Source

I have trouble understanding how to compute the lookaheads for the LR(1)-items.

Lets say that I have this grammar:

S -> AB
A -> aAb | a
B -> d

A LR(1)-item is an LR(0) item with a lookahead. So we will get the following LR(0)-item for state 0:

S -> .AB , {lookahead} 
A -> .aAb,  {lookahead} 
A -> .a,  {lookahead}

State: 1

A ->  a.Ab, {lookahead} 
A ->  a. ,{lookahead} 
A -> .aAb ,{lookahead} 
A ->.a ,{lookahead}

Can somebody explain how to compute the lookaheads ? What is the general approach ?

Thank you in advance

Answer

templatetypedef picture templatetypedef · Jan 2, 2013

The lookaheads used in an LR(1) parser are computed as follows. First, the start state has an item of the form

S -> .w  ($)

for every production S -> w, where S is the start symbol. Here, the $ marker denotes the end of the input.

Next, for any state that contains an item of the form A -> x.By (t), where x is an arbitrary string of terminals and nonterminals and B is a nonterminal, you add an item of the form B -> .w (s) for every production B -> w and for every terminal in the set FIRST(yt). (Here, FIRST refers to FIRST sets, which are usually introduced when talking about LL parsers. If you haven't seen them before, I would take a few minutes to look over those lecture notes).

Let's try this out on your grammar. We start off by creating an item set containing

S -> .AB ($)

Next, using our second rule, for every production of A, we add in a new item corresponding to that production and with lookaheads of every terminal in FIRST(B$). Since B always produces the string d, FIRST(B$) = d, so all of the productions we introduce will have lookahead d. This gives

S -> .AB ($)
A -> .aAb (d)
A -> .a (d)

Now, let's build the state corresponding to seeing an 'a' in this initial state. We start by moving the dot over one step for each production that starts with a:

A -> a.Ab (d)
A -> a. (d)

Now, since the first item has a dot before a nonterminal, we use our rule to add one item for each production of A, giving those items lookahead FIRST(bd) = b. This gives

A -> a.Ab (d)
A -> a. (d)
A -> .aAb (b)
A -> .a (b)

Continuing this process will ultimately construct all the LR(1) states for this LR(1) parser. This is shown here:

[0]
S -> .AB  ($)
A -> .aAb (d)
A -> .a   (d)

[1]
A -> a.Ab (d)
A -> a.   (d)
A -> .aAb (b)
A -> .a   (b)

[2]
A -> a.Ab (b)
A -> a.   (b)
A -> .aAb (b)
A -> .a   (b)

[3]
A -> aA.b (d)

[4]
A -> aAb. (d)

[5]
S -> A.B  ($)
B -> .d   ($)

[6]
B -> d.   ($)

[7]
S -> AB.  ($)

[8]
A -> aA.b (b)

[9]
A -> aAb. (b)

In case it helps, I taught a compilers course last summer and have all the lecture slides available online. The slides on bottom-up parsing should cover all of the details of LR parsing and parse table construction, and I hope that you find them useful!

Hope this helps!