How can I use lookbehind in a C# Regex in order to skip matches of repeated prefix patterns?

luvieere · Oct 1, 2010 · Viewed 12.9k times

For example, I'm trying to make the expression match all the b characters that follow any number of a characters:

Regex expression = new Regex("(?<=a).*");

foreach (Match result in expression.Matches("aaabbbb"))
  MessageBox.Show(result.Value);

This returns aabbbb, with the lookbehind matching only a single a. How can I make it skip all the a characters at the beginning?

I've tried

Regex expression = new Regex("(?<=a+).*");

and

Regex expression = new Regex("(?<=a)+.*");

without success.

What I'm expecting is bbbb.

Answer

John Gietzen · Oct 1, 2010

Are you looking for a repeated capturing group?

(.)\1*

This will return two matches.

Given:

aaabbbb

This will result in:

aaa
bbbb
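
As a quick way to check this, here is a minimal console sketch that runs the pattern against the sample input. The class name RepeatedRunsDemo and the helper Runs are made up for illustration; only Regex.Matches comes from System.Text.RegularExpressions.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RepeatedRunsDemo
{
    // Hypothetical helper: returns each run of one repeated character.
    // (.) captures a single character and \1* consumes further copies of it.
    public static string[] Runs(string input)
    {
        MatchCollection matches = Regex.Matches(input, @"(.)\1*");
        string[] runs = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            runs[i] = matches[i].Value;
        }
        return runs;
    }

    public static void Main()
    {
        foreach (string run in Runs("aaabbbb"))
        {
            Console.WriteLine(run); // prints "aaa", then "bbbb"
        }
    }
}
```

Each match here is one whole run, so this splits the input into groups rather than skipping the prefix.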

This:

(?<=(.))(?!\1).*

Uses the above principle: it first finds the previous character, capturing it into a backreference, and then asserts that the next character is not that same character.

That matches:

bbbb
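
To try this from code, a small sketch along these lines should do. The class SkipPrefixDemo and the helper AfterRepeatedPrefix are hypothetical names; the sketch takes only the first match with Regex.Match rather than looping over Matches as the question does.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SkipPrefixDemo
{
    // Hypothetical helper: returns the first match of the pattern above,
    // or an empty string when nothing matches.
    public static string AfterRepeatedPrefix(string input)
    {
        // (?<=(.)) captures the previous character into group 1;
        // (?!\1) rejects positions where the next character repeats it.
        Match m = Regex.Match(input, @"(?<=(.))(?!\1).*");
        return m.Success ? m.Value : string.Empty;
    }

    public static void Main()
    {
        Console.WriteLine(AfterRepeatedPrefix("aaabbbb")); // prints "bbbb"
    }
}
```

One thing to watch for: when looping over Matches as in the question, a pattern ending in .* can also produce a zero-length match at the very end of the input in .NET, so taking only the first match (as here) or using .+ instead may be preferable.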