I'm trying to get the start and end positions of every occurrence of a query in a sequence using re.findall:
import re
sequence = 'aaabbbaaacccdddeeefff'
query = 'aaa'
findall = re.findall(query,sequence)
>>> ['aaa','aaa']
How do I get something like findall.start() or findall.end()?
I would like to get:
start = [0,6]
end = [2,8]
I know that
search = re.search(query,sequence)
print search.start(),search.end()
>>> 0 3
gives me only the first match (and end() is one past the last matched character, so it returns 3 rather than 2).
Use re.finditer:
>>> import re
>>> sequence = 'aaabbbaaacccdddeeefff'
>>> query = 'aaa'
>>> r = re.compile(query)
>>> [[m.start(),m.end()] for m in r.finditer(sequence)]
[[0, 3], [6, 9]]
From the docs:
Return an iterator yielding MatchObject instances over all non-overlapping matches for the RE pattern in string. The string is scanned left-to-right, and matches are returned in the order found.
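If you want the two separate lists from the question, with inclusive end indices, something along these lines should work. This is only a sketch: the names matches, start and end are my own, and subtracting 1 from m.end() assumes every match is at least one character long (an empty match would give an end before its start).

```python
import re

sequence = 'aaabbbaaacccdddeeefff'
query = 'aaa'

# Materialise the iterator once so both lists come from the same matches.
matches = list(re.finditer(query, sequence))

# m.start() is the index of the first matched character.
start = [m.start() for m in matches]

# m.end() is exclusive (one past the last matched character), so
# subtract 1 to get the inclusive index the question asks for.
end = [m.end() - 1 for m in matches]

print(start)  # [0, 6]
print(end)    # [2, 8]
```

If query can contain characters that are special in a regular expression, passing re.escape(query) instead of query treats it as a literal string.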