Proof of detecting the start of cycle in linked list

algorithm linked-list

Shamim Hafiz · Oct 17, 2010 · Viewed 24.7k times · Source

From several posts inside stackoverflow and outside, I have come to know how to detect cycles in a linked list, the length of a cycle. I also found the method on how to detect the start of the loop.

Here are the steps again for reference.

Detecting Loop:

Have two pointers, classically called hare and tortoise. Move hare by 2 steps and tortoise by 1. If they meet at some point, then there is surely a cycle and the meeting point is obviously inside the cycle.

Finding length of Loop:

Keep one pointer fixed at meeting point while increment the other until they are same again. Increment a counter as you go along and the counter value at meet will be the length of cycle.

Find the start of cycle

Take one pointer to start of the list and keep the other at the meeting point. Now increment both by one and the meet point is the start of loop. I verified the method using some cases on paper, but I don't understand why it should work.

Can someone, provide a mathematical proof as to why this method works?

Answer

If you represent a list by a pointer to its first node (list)

enter image description here

The algorithm to detect loops is described as follows:

Declare two pointers (pFast) and (pSlow).
Make pSlow and pFast point to list.
Until (pSlow), (pFast) or both point to NULL:
3. If , then STOP as a loop has just been found.
If this point has been reached (one or both two pointers are NULL) then there are no loops in the list.

Lets assume that this algorithm is correct. In this scheme, a loop situation is represented by the following diagram:

enter image description here

Note how every node, except the one pointing to the begining of a loop, is labeled with the number of its target minus one. So, if one node is labeled with i and it is not the begining of a loop, then it is pointed as next element by the node labeled with i-1.

The algorithm explained above can be described as a loop since its main step (3) is a set of sub-steps which repeated until the exit condition is satisfied. That forces us to represent pFast and pSlow in function of the iteration number in the algorithm execution (t).

enter image description here

If the list hadn’t have loops, pointers positions would be described in function of t as:

enter image description here

However there is a node where the loop starts and that function stops describing the evolution of the pointers. Assuming that this pointer is tagged with m, that the loop contains nodes (that is enter image description here and ), and setting t=0 as iteration value when then:

enter image description here

If one pointer is indeed enough to detect loops using the algorithm described, then it must exist at least a solution to the equation enter image description here .