Regex - lookahead assertion

luk4443 · Sep 26, 2010

I have a problem with the lookahead assertion (?=). For example, I have this expression:

/Win(?=2000)/

It matches Win if the string is something like Win2000 or Win2000fgF. Then I have this expression:
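To make the first case concrete, here is a minimal sketch in Python (the choice of Python and its `re` module is an assumption; the question does not name a language). It only shows that the lookahead checks for 2000 without making it part of the match:

```python
import re

# Same pattern as /Win(?=2000)/, written as a Python raw string.
win_pattern = re.compile(r"Win(?=2000)")

hit = win_pattern.search("Win2000fgF")
hit_text = hit.group()   # only "Win" is part of the match
hit_span = hit.span()    # the match ends before "2000"

# "Win" not followed by "2000": the lookahead fails, so there is no match.
miss = win_pattern.search("WinXP")

print(hit_text, hit_span, miss)
```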

^(?=.*\d)(?=.*[a-z]).*$

It matches strings that contain a digit and a lowercase letter, for example: 45dF, 4Dd. But I don't understand why it works and matches all the characters :) Nothing is consumed before (?=.*\d). I would have thought that only this expression should work:

^.*(?=.*\d)(?=.*[a-z]).*$

(with .* before the lookaheads).

Could you explain it?

Answer

Tim Pietzcker · Sep 26, 2010

Let's say we are the regex engine and apply the regex ^(?=.*\d)(?=.*[a-z]).*$ to the string 2a.

Starting at position 0 (before the first character):

  1. ^: Make sure we're at the start of the string: OK
  2. (?=: Let's check if the following regex could match...
  3. .*: match any number of characters -> 2a. OK.
  4. \d: Nope, we're already at the end. Let's go back one character: a --> No, doesn't match. Go back another one: 2 --> MATCH!
  5. ): End of lookahead, match successful. We're still at position 0!
  6. (?=: Let's check if the following regex could match...
  7. .*: match any number of characters -> 2a. OK.
  8. [a-z]: Nope, we're already at the end. Let's go back one character: a --> MATCH!
  9. ): End of lookahead, match successful. We're still at position 0!
  10. .*: match any number of characters -> 2a --> MATCH!
  11. $: Let's see - are we at the end of the string? Yes, we are! --> MATCH!
  12. Hey, we've reached the end of the regex. Great. Match completed!
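The walkthrough above can be checked with a short sketch. Python's `re` module is used here as an assumption, since the question does not say which regex flavor is involved, and the variable names are made up for illustration. It shows that the lookaheads consume nothing (the engine is still at position 0 after them), that the final .* is what actually matches the characters, and that the variant with a leading .* also matches 2a, because that leading .* simply backtracks until the lookaheads succeed.

```python
import re

# The regex from the question, and the variant with a leading .*
original = re.compile(r"^(?=.*\d)(?=.*[a-z]).*$")
prefixed = re.compile(r"^.*(?=.*\d)(?=.*[a-z]).*$")

# Full regex on "2a": both lookaheads pass, then .* consumes everything.
full = original.match("2a")
full_text = full.group()
full_span = full.span()

# Lookaheads alone: they succeed but are zero-width, so the match is empty.
only_lookaheads = re.match(r"^(?=.*\d)(?=.*[a-z])", "2a")
lookahead_span = only_lookaheads.span()

# Strings that lack a digit or a lowercase letter are rejected.
no_digit = original.match("abc")
no_lower = original.match("45DF")

# The prefixed variant: the leading .* first grabs "2a", the lookaheads
# fail there, and it backtracks to the empty string at position 0.
prefixed_text = prefixed.match("2a").group()

print(full_text, full_span, lookahead_span, no_digit, no_lower, prefixed_text)
```

So the leading .* is not needed: each lookahead brings its own .* to scan ahead from position 0.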