From several posts inside stackoverflow and outside, I have come to know how to detect cycles in a linked list, the length of a cycle. I also found the method on how to detect the start of the loop.
Here are the steps again for reference.
Detecting Loop:
Have two pointers, classically called hare and tortoise. Move hare by 2 steps and tortoise by 1. If they meet at some point, then there is surely a cycle and the meeting point is obviously inside the cycle.
Finding length of Loop:
Keep one pointer fixed at meeting point while increment the other until they are same again. Increment a counter as you go along and the counter value at meet will be the length of cycle.
Find the start of cycle
Take one pointer to start of the list and keep the other at the meeting point. Now increment both by one and the meet point is the start of loop. I verified the method using some cases on paper, but I don't understand why it should work.
Can someone, provide a mathematical proof as to why this method works?
If you represent a list by a pointer to its first node (list)
The algorithm to detect loops is described as follows:
Lets assume that this algorithm is correct. In this scheme, a loop situation is represented by the following diagram:
Note how every node, except the one pointing to the begining of a loop, is labeled with the number of its target minus one. So, if one node is labeled with i and it is not the begining of a loop, then it is pointed as next element by the node labeled with i-1.
The algorithm explained above can be described as a loop since its main step (3) is a set of sub-steps which repeated until the exit condition is satisfied. That forces us to represent pFast and pSlow in function of the iteration number in the algorithm execution (t).
If the list hadn’t have loops, pointers positions would be described in function of t as:
However there is a node where the loop starts and that function stops describing the evolution of the pointers. Assuming that this pointer is tagged with m, that the loop contains nodes (that is and ), and setting t=0 as iteration value when then:
If one pointer is indeed enough to detect loops using the algorithm described, then it must exist at least a solution to the equation .
This equation is true if and only if there is a value for t that makes:
This ends in a function, which generates values of t that are valid solutions to the equation described above:
Thus It is proved that one slow pointer and one fast pointer are enough to detect loop conditions in a linked list.