Alternative to the `match = re.match(); if match: ...` idiom?

dbr · Jul 20, 2009

If you want to check whether something matches a regex and, if so, print the first group, you do..

import re
match = re.match(r"(\d+)g", "123g")
if match is not None:
    print(match.group(1))

This is completely pedantic, but the intermediate match variable is a bit annoying..

Languages like Perl do this by creating new $1..$9 variables for match groups, like..

if($blah =~ /(\d+)g/){
    print $1
}

From this reddit comment,

with re_context.match('^blah', s) as match:
    if match:
        ...
    else:
        ...

..which I thought was an interesting idea, so I wrote a simple implementation of it:

#!/usr/bin/env python2.6
import re

class SRE_Match_Wrapper:
    def __init__(self, match):
        self.match = match

    def __exit__(self, type, value, tb):
        pass

    def __enter__(self):
        return self.match

    def __getattr__(self, name):
        # Only called when normal lookup fails, so __enter__ and __exit__
        # never reach here; delegate everything else to the match object.
        return getattr(self.match, name)

def rematch(pattern, inp):
    matcher = re.compile(pattern)
    # Wrap the result (a match object or None) so it can be used with "with"
    return SRE_Match_Wrapper(matcher.match(inp))

if __name__ == '__main__':
    # Example: matches, prints "123"
    with rematch(r"(\d+)g", "123g") as m:
        if m:
            print(m.group(1))

    # Example: no match, prints nothing
    with rematch(r"(\d+)g", "123") as m:
        if m:
            print(m.group(1))

(This functionality could theoretically be patched into the _sre.SRE_Match object)

It would be nice if you could skip the execution of the with statement's code block, if there was no match, which would simplify this to..

with rematch(r"(\d+)g", "123") as m:
    print(m.group(1)) # only executed if the match occurred

..but this seems impossible based on what I can deduce from PEP 343
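A small sketch of why PEP 343 seems to rule this out, assuming I am reading the expansion in the PEP correctly: the body runs whenever `__enter__` returns normally, so the only way for a context manager to keep the block from starting is to raise from `__enter__`, and that exception propagates to the caller (`__exit__` is not called for it). `NoMatch` and `strict_rematch` are hypothetical names made up for this illustration.

```python
import re


class NoMatch(Exception):
    """Hypothetical exception, used only for this illustration."""


class strict_rematch(object):
    """Hypothetical context manager that refuses to enter without a match."""

    def __init__(self, pattern, inp):
        self.match = re.match(pattern, inp)

    def __enter__(self):
        if self.match is None:
            # Raising here keeps the body from running,
            # but the exception escapes the with statement.
            raise NoMatch()
        return self.match

    def __exit__(self, exc_type, exc_value, tb):
        # Not called when __enter__ itself raised.
        return False


results = []
try:
    with strict_rematch(r"(\d+)g", "123") as m:
        results.append(m.group(1))  # never reached: there is no match
except NoMatch:
    results.append("skipped, but only via try/except")

with strict_rematch(r"(\d+)g", "123g") as m:
    results.append(m.group(1))

print(results)
```

The outer try/except is no shorter than the original `if match:` check, so this does not really remove the annoyance.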

Any ideas? As I said, this is a really trivial annoyance, almost to the point of being code-golf..

Answer

Glenn Maynard · Jul 20, 2009

I don't think it's trivial. I don't want to have to sprinkle a redundant conditional around my code if I'm writing code like that often.

This is slightly odd, but you can do this with an iterator:

import re

def rematch(pattern, inp):
    matcher = re.compile(pattern)
    matches = matcher.match(inp)
    if matches:
        yield matches

if __name__ == '__main__':
    for m in rematch(r"(\d+)g", "123g"):
        print(m.group(1))

The odd thing is that it's using an iterator for something that isn't iterating--it's closer to a conditional, and at first glance it might look like the loop is going to run several times, once per match, when it actually runs at most once.
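To make the "closer to a conditional" point concrete, here is a sketch showing that the generator yields at most one item, so the loop body runs zero or one times. Pairing it with `for ... else` and a `break` is one possible way to get back the else branch from the reddit example; that pairing and the `describe` helper are my own assumptions for illustration, not something the original idea requires.

```python
import re


def rematch(pattern, inp):
    # Same generator as above: yields the match object once, or nothing.
    m = re.match(pattern, inp)
    if m:
        yield m


def describe(inp):
    # Hypothetical helper for this illustration.
    for m in rematch(r"(\d+)g", inp):
        result = "matched " + m.group(1)
        break  # leaving via break skips the else clause
    else:
        # Runs only when the loop ended without break, i.e. no match.
        result = "no match"
    return result


print(len(list(rematch(r"(\d+)g", "123g"))))  # one item on a match
print(len(list(rematch(r"(\d+)g", "123"))))   # zero items otherwise
print(describe("123g"))
print(describe("123"))
```

The `break` is easy to forget, and without it the else clause would run even after a successful match, so this is arguably no clearer than a plain `if`.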

It does seem odd that a context manager can't cause its managed block to be skipped entirely; while that's not explicitly one of the use cases of "with", it seems like a natural extension.