Context free grammar for non-palindrome

Abhijith Madhav picture Abhijith Madhav · Jun 27, 2011 · Viewed 19.8k times · Source

I need a CFG which will generate strings other than palindromes. The solution has been provided and is as below.(Introduction to theory of computation - Sipser)

R -> XRX | S
S -> aTb | bTa
T -> XTX | X | <epsilon>
X -> a | b

I get the general idea of how this grammar works. It mandates the insertion of a sub-string which has corresponding non-equal alphabets on its either half, through the production S -> aTb | bTa, thus ensuring that a palindrome could never be generated.

I will write down the semantics of the first two productions as I have understood it,

  • S generates strings which cannot be palindromes because their 1st and last alphabets are not equal
  • R consists of at-least one S as a sub-string ensuring that it is never a palindrome.

I don't completely understand the semantics of the third production, i.e. .

   T -> XTX | X | <epsilon>
   X -> a | b

The way I see it, T can generate any combination of a and b, i.e. {a, b}*. Why could it not have been like

T -> XT | <epsilon>
X -> a | b

Aren't the two equivalent? As the later is more intuitive, why isn't it used?

Answer

Jeffrey L Whitledge picture Jeffrey L Whitledge · Jun 27, 2011

The definition of T in that grammer does indeed appear to be unnecessary complication. T can generate any string of as and bs, so the simpler definition would have been just as good.

I can only guess that the productions are given as they are because of the sausage-factory nature of writing a book.

ORIGINAL WRONG ANSWER:

They are not equivalent, because X itself cannot be <epsilon>, and T is not any combination of a and b. T can only expand to a palindrome (including the empty palindrome, a single character, or a palindrome with an unpaired central character).

If X could be empty, then T could expand to anything, but it can't.

NOTE

This answer is based on the supposition that the author’s intention for the production T -> XTX is that the two identical non-terminals in the substitution must represent identical strings of characters. Since I don't have the text to look at, I don't know if this assumption is well-founded except that it is motivated by the question itself. This supposition could be a mistake by the author if that is not the case elsewhere. I think that, in general, this requirement is not true of context-free grammers.

The correct productions would be:

R -> aRa | bRb | S
S -> aTb | bTa
T -> aTa | bTb | a | b | <epsilon>