Detecting a repeating cycle in a sequence of numbers (python)

SuperLemon picture SuperLemon · Dec 29, 2011 · Viewed 7.9k times · Source

I was wondering what would be a a fairly 'common' or normal way of doing this. Wasn't really looking for the shortest possible answer like a 2-liner or anything. I've just quickly put this piece of code together but I can't not feel like there's way too much in there. Also if there are any libraries that could help with this, that would be very nice.

def get_cycle(line):
    nums = line.strip().split(' ')

    # 2 main loops, for x and y
    for x in range(2, len(nums)): # (starts at 2, assuming the sequence requires at least 2 members)
        for y in range(0, x):
            # if x is already in numbers before it
            if nums[x] == nums[y]:
                seq = [nums[x]] # (re)start the sequence
                adder = 1       # (re)set the adder to 1
                ok = True       # (re)set ok to be True
                # while the sequence still matches (is ok) and
                # tail of y hasn't reached start of x
                while ok and y + adder < x:
                    if nums[x + adder] == nums[y + adder]:  # if next y and x match
                        seq.append(nums[x + adder])         # add the number to sequence
                        adder += 1                          # increase adder
                    else:
                        ok = False                          # else the sequence is broken
                # if the sequence wasn't broken and has at least 2 members
                if ok and len(seq) > 1:
                    print(' '.join(seq))    # print it out, separated by an empty space
                    return

Answer

Andrew Clark picture Andrew Clark · Dec 29, 2011

I may not be properly understanding this, but I think there is a very simple solution with regex.

(.+ .+)( \1)+

Here is an example:

>>> regex = re.compile(r'(.+ .+)( \1)+')
>>> match = regex.search('3 0 5 5 1 5 1 6 8')
>>> match.group(0)    # entire match
'5 1 5 1'
>>> match.group(1)    # repeating portion
'5 1'
>>> match.start()     # start index of repeating portion
6

>>> match = regex.search('2 0 6 3 1 6 3 1 6 3 1')
>>> match.group(1)
'6 3 1'

Here is how it works, (.+ .+) will match at least two numbers (as many as possible) and place the result into capture group 1. ( \1)+ will match a space followed by the contents of capture group 1, at least once.

And an extended explanation for the string '3 0 5 5 1 5 1 6 8':

  • (.+ .+) will originally match the entire string, but will give up characters off the end because ( \1)+ will fail, this backtracking will occur until (.+ .+) cannot match at the beginning of the string at which point the regex engine will move forward in the string and try again
  • This will happen until the capture group starts at the second 5, it will give up characters at the end until '5 1' is captured, at which point the regex is looking for any number of ' 5 1' for ( \1)+, it will of course find this and the match will succeed