Return the second instance of a regex search in a line

captain yossarian · Sep 18, 2014 · Viewed 10.3k times

I have a file that has a specific line of interest (say, line 12) that looks like this:

conform: 244216 (packets) exceed: 267093 (packets)

I've written a script to pull the first number via regex and dump the value into a new file:

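import re

# readlines() returns a zero-indexed list, so [12] selects the 13th line
with open("file1.txt", "r") as inp:
    getexceeds = inp.readlines()[12]
output = re.search(r"\d+", getexceeds).group(0)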

with open("file2.txt", "w") as outp:
    outp.write(output)

I'm not yet good enough to pull the second number from that line and write it to a new file. Can anyone suggest a way?

Thanks as always for any help!

Answer

FrobberOfBits · Sep 18, 2014

You've got it almost right, but re.search only returns the first match of your pattern, and your pattern describes a single number.

match = re.search(r"(\d+).*?(\d+)", getexceeds)
firstNumber = match.group(1)
secondNumber = match.group(2)

Notice that the regex contains two capturing groups (in parentheses), each matching a sequence of digits. What's between them can be anything: .*? is a non-greedy match, meaning as few characters as possible (any character except a newline) before the next digit sequence.
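As an alternative to the two-group pattern (a different technique, not part of the answer above), re.findall returns every non-overlapping match as a list, so the second number is the element at index 1. This is only a minimal sketch: the helper name nth_number and the variable line are hypothetical, chosen for illustration.

```python
import re

def nth_number(text, n):
    """Return the n-th (zero-based) run of digits in text, or None if there is none."""
    numbers = re.findall(r"\d+", text)  # all digit runs, in left-to-right order
    return numbers[n] if 0 <= n < len(numbers) else None

# Sample line taken from the question
line = "conform: 244216 (packets) exceed: 267093 (packets)"
second = nth_number(line, 1)
print(second)
```

This form may be easier to extend if the line later contains a third or fourth number, since no change to the pattern is needed.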

Here's a little test I ran from the shell:

>>> import re
>>> line = 'conform: 244216 (packets) exceed: 267093 (packets)'
>>> match = re.search(r"(\d+).*?(\d+)", line)
>>> print(match.group(1))
244216
>>> print(match.group(2))
267093
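Putting the pieces together, the original goal (writing the second number to a new file) could be sketched roughly as follows. The function name write_second_number and the temporary demo files are assumptions made for illustration; the pattern and the line index 12 come from the question and answer above.

```python
import os
import re
import tempfile

def write_second_number(src_path, dst_path, line_index=12):
    """Write the second number found on the chosen line of src_path to dst_path."""
    with open(src_path, "r") as src:
        line = src.readlines()[line_index]  # zero-based index, as in the question
    match = re.search(r"(\d+).*?(\d+)", line)
    if match is None:
        # re.search returns None when the line does not contain two numbers
        raise ValueError("expected two numbers at line index %d" % line_index)
    with open(dst_path, "w") as dst:
        dst.write(match.group(2))
    return match.group(2)

# Demo on throwaway files: 12 filler lines, then the sample line at index 12.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "file1.txt")
dst_path = os.path.join(workdir, "file2.txt")
with open(src_path, "w") as f:
    f.write("filler\n" * 12)
    f.write("conform: 244216 (packets) exceed: 267093 (packets)\n")

result = write_second_number(src_path, dst_path)
```

Checking for None before calling group() avoids an AttributeError when the line of interest does not have the expected shape.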