Representing identifiers using Regular Expression

Jeris Alan · Feb 19, 2013 · Viewed 34.9k times

The regular definition for recognizing identifiers in the C programming language is given by

letter -> a|b|...|z|A|B|...|Z|_
digit -> 0|1|...|9
identifier -> letter(letter|digit)*

This definition will generate identifiers of the form

identifier: [_a-zA-Z][_a-zA-Z0-9]*

My question is: how do you limit the length of an identifier to no more than 31 characters? What changes need to be made to the regular definition, or how would you write a regular expression that enforces this limit? Could anyone please help? Thanks.

Answer

Oscar Mederos · Feb 19, 2013

The regular expression you are looking for is:

[_a-zA-Z][_a-zA-Z0-9]{0,30}

It matches an underscore or letter followed by X underscores, letters, or digits, where 0 <= X <= 30, so the whole identifier is at most 31 characters long.
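
One thing to watch: the length limit only holds if the pattern has to cover the entire string. Used as an unanchored search, it would happily match the first 31 characters of a longer name. Here is a minimal sketch in Python to check this, assuming the standard `re` module; the helper name `is_identifier` is just made up for illustration:

```python
import re

# Pattern from the answer: one leading letter or underscore,
# then at most 30 more letters, digits, or underscores.
IDENTIFIER = re.compile(r"[_a-zA-Z][_a-zA-Z0-9]{0,30}")


def is_identifier(text: str) -> bool:
    """Return True if the whole string is an identifier of 1 to 31 characters."""
    # fullmatch requires the pattern to span the entire string, which is
    # what actually enforces the upper bound on the length.
    return IDENTIFIER.fullmatch(text) is not None


print(is_identifier("_count1"))  # True
print(is_identifier("a" * 31))   # True
print(is_identifier("a" * 32))   # False: one character too long
print(is_identifier("9lives"))   # False: starts with a digit
```

In an engine without a full-match function, wrapping the pattern in anchors, as in `^[_a-zA-Z][_a-zA-Z0-9]{0,30}$`, should give the same effect.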