Python Regex Negative Lookbehind

Matt Elson · Dec 19, 2012 · Viewed 11.8k times

The pattern (?<!(asp|php|jsp))\?.* works in PCRE, but it doesn't work in Python.

So what can I do to get this regex working in Python? (Python 2.7)

Answer

Martin Ender · Dec 19, 2012

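It works perfectly fine for me. Are you maybe using it wrong? Make sure to use re.search instead of re.match, because re.match only tries to match at the very start of the string, and your strings do not start with a question mark: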

>>> import re
>>> s = 'somestring.asp?1=123'
>>> re.search(r"(?<!(asp|php|jsp))\?.*", s)
>>> s = 'somestring.xml?1=123'
>>> re.search(r"(?<!(asp|php|jsp))\?.*", s)
<_sre.SRE_Match object at 0x0000000002DCB098>

This is exactly how your pattern should behave: no match when the ? follows asp, and a match when it follows xml. As glglgl mentioned, you can get the matched text if you assign that Match object to a variable (say m) and then call m.group(). That yields ?1=123.
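Here is a minimal sketch of the same check as a script, using the two sample strings from above. The helper name query_part is hypothetical and only there for illustration. It also shows that re.match returns nothing even for the .xml string, since the string does not begin with ?.

```python
import re

# Same pattern as in the question, compiled once.
PATTERN = re.compile(r"(?<!(asp|php|jsp))\?.*")


def query_part(s):
    """Return the '?...' tail unless the '?' directly follows asp, php or jsp."""
    m = PATTERN.search(s)  # search scans the whole string
    return m.group() if m else None


print(query_part('somestring.xml?1=123'))     # ?1=123
print(query_part('somestring.asp?1=123'))     # None

# match only tries position 0, where the character is 's', not '?'.
print(PATTERN.match('somestring.xml?1=123'))  # None
```

Checking for None before calling group() matters here, because both search and match return None when nothing is found.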

By the way, you can leave out the inner parentheses. This pattern is equivalent:

(?<!asp|php|jsp)\?.*
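If you want to convince yourself of that, here is a quick sketch that runs both variants over a few strings and compares the matched text. The sample strings and the helper matched_text are made up for illustration. The only difference I would expect is that the first variant defines a capturing group (which never participates in a match) and the second does not.

```python
import re

with_group = re.compile(r"(?<!(asp|php|jsp))\?.*")
without_group = re.compile(r"(?<!asp|php|jsp)\?.*")

# Illustrative inputs, not taken from the question.
samples = [
    'somestring.asp?1=123',
    'somestring.php?x=1',
    'somestring.jsp?',
    'somestring.xml?1=123',
    'no query here',
]


def matched_text(pattern, s):
    """Return the text matched by pattern.search, or None if nothing matched."""
    m = pattern.search(s)
    return m.group() if m else None


for s in samples:
    # Both variants should agree on every input.
    assert matched_text(with_group, s) == matched_text(without_group, s)

# The inner parentheses only add a capturing group.
print(with_group.groups)     # 1
print(without_group.groups)  # 0
```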