Find longest increasing sequence

pappu · Feb 8, 2011 · Viewed 25.1k times

You are given a sequence of numbers and need to find a longest increasing subsequence of the input (not necessarily contiguous).

I found the Wikipedia article on this (Longest increasing subsequence) but need more explanation.

If anyone could help me understand the O(n log n) implementation, that would be really helpful. An explanation of the algorithm with an example would be much appreciated.

I saw the other posts as well, and what I did not understand is this part:

    L = 0
    for i = 1, 2, ... n:
        binary search for the largest positive j ≤ L such that X[M[j]] < X[i] (or set j = 0 if no such value exists)

In the statement above, where does the binary search start? And how should M[] and X[] be initialized?

Answer

fgb · Feb 11, 2011

A simpler problem is to find the length of the longest increasing subsequence. You can focus on understanding that problem first. The only difference in the algorithm is that it doesn't use the P array.

x is the input sequence, so it can be initialized as: x = [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]

m keeps track of the best subsequence of each length found so far. The best is the one with the smallest ending value (which allows a wider range of values to be added after it). The length and the ending value are the only data that need to be stored for each subsequence.

Each element of m represents a subsequence. For m[j],

  • j is the length of the subsequence.
  • m[j] is the index (in x) of the last element of the subsequence.
  • so, x[m[j]] is the value of the last element of the subsequence.

L is the length of the longest subsequence found so far. Entries m[1] through m[L] are valid; the rest are uninitialized. m can start with its first element, m[0], set to 0 and the rest uninitialized. L increases as the algorithm runs, and so does the number of initialized values of m.
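As a rough sketch of the length-only version described so far (no P array), the loop could look like the following in Python. The function name `lis_length` and the local names `lo`, `hi`, and `mid` are my own choices for illustration; the binary search simply runs over j = 1..L, which is one way to answer "where does the search start".

```python
def lis_length(x):
    """Return the length of a longest strictly increasing subsequence of x."""
    n = len(x)
    # m[j] = index in x of the smallest ending value among the increasing
    # subsequences of length j found so far. m[0] is an unused placeholder.
    m = [0] * (n + 1)
    L = 0  # length of the longest subsequence found so far

    for i in range(n):
        # Binary search over j in 1..L for the largest j with x[m[j]] < x[i].
        lo, hi = 1, L
        while lo <= hi:
            mid = (lo + hi) // 2
            if x[m[mid]] < x[i]:
                lo = mid + 1
            else:
                hi = mid - 1
        # hi is now that largest j (0 if none exists), so x[i] ends a
        # subsequence of length lo == hi + 1.
        m[lo] = i
        if lo > L:
            L = lo
    return L


x = [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
print(lis_length(x))  # 6
```

Each of the n iterations does one binary search over at most n entries, which is where the O(n log n) bound comes from.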

Here's an example run. For each iteration, x[i] and the state of m at the end of that iteration are shown (with values of the sequence in place of indexes).

The search in each iteration looks for where to place x[i]. It should go as far to the right as possible (to get the longest sequence) while still being greater than the value to its left (so the sequence stays increasing).

 0:  m = [0, 0]        - ([0] is a subsequence of length 1.)
 8:  m = [0, 0, 8]     - (8 can be added after [0] to get a sequence of length 2.)
 4:  m = [0, 0, 4]     - (4 is better than 8. This can be added after [0] instead.)
 12: m = [0, 0, 4, 12] - (12 can be added after [...4])
 2:  m = [0, 0, 2, 12] - (2 can be added after [0] instead of 4.)
 10: m = [0, 0, 2, 10]
 6:  m = [0, 0, 2, 6]
 14: m = [0, 0, 2, 6, 14]
 1:  m = [0, 0, 1, 6, 14]
 9:  m = [0, 0, 1, 6, 9]
 5:  m = [0, 0, 1, 5, 9]
 13: m = [0, 0, 1, 5, 9, 13]
 3:  m = [0, 0, 1, 3, 9, 13]
 11: m = [0, 0, 1, 3, 9, 11]
 7:  m = [0, 0, 1, 3, 7, 11]
 15: m = [0, 0, 1, 3, 7, 11, 15]
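If you want to reproduce the run above yourself, here is a small sketch that stores values rather than indexes, the same simplification the table uses. It relies on Python's standard `bisect_left` for the search; `trace_tails` and `tails` are names I made up for this example, and `tails` has no placeholder slot, so it corresponds to m without its leading 0.

```python
from bisect import bisect_left


def trace_tails(x):
    """Return the list of smallest ending values (by length) after each element."""
    tails = []      # tails[k] = smallest ending value of a subsequence of length k + 1
    snapshots = []
    for value in x:
        pos = bisect_left(tails, value)  # first slot holding something >= value
        if pos == len(tails):
            tails.append(value)          # value extends the longest subsequence
        else:
            tails[pos] = value           # value is a smaller end for length pos + 1
        snapshots.append(list(tails))
    return snapshots


x = [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
for value, tails in zip(x, trace_tails(x)):
    print(f"{value:>2}: {tails}")
```

Note that the final list, [0, 1, 3, 7, 11, 15], happens to be increasing but is not itself a subsequence that was built in order; it only records the best ending value for each length.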

Now we know there is a subsequence of length 6, ending in 15. The actual values in the subsequence can be found by storing them in the P array during the loop.

Retrieving the best sub-sequence:

P stores, for each number, the previous element in the longest subsequence ending at that number (as an index of x), and is updated as the algorithm advances. For example, when we process 8, we know it comes after 0, so we store the fact that 8 is after 0 in P. You can work backwards from the last number, like following a linked list, to get the whole sequence.

So for each number we know the number that came before it. To find the subsequence ending in 7, we look at P and see that:

7 is after 3
3 is after 1
1 is after 0

So we have the subsequence [0, 1, 3, 7].

The subsequences ending in 11 and 15 share some numbers:

15 is after 11
11 is after 9
9 is after 6
6 is after 2
2 is after 0

So we have the subsequences [0, 2, 6, 9, 11] and [0, 2, 6, 9, 11, 15] (the longest increasing subsequence).
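Putting m and P together, a minimal sketch of the full algorithm might look like this. The function name and the use of -1 to mean "no previous element" are assumptions of mine, not something taken from the Wikipedia pseudocode.

```python
def longest_increasing_subsequence(x):
    """Return one longest strictly increasing subsequence of x as a list."""
    n = len(x)
    m = [0] * (n + 1)  # m[j]: index in x of the best (smallest) end for length j
    p = [-1] * n       # p[i]: index of the element before x[i], or -1 if none
    L = 0              # length of the longest subsequence found so far

    for i in range(n):
        # Largest j in 1..L with x[m[j]] < x[i]; after the loop, j == lo - 1.
        lo, hi = 1, L
        while lo <= hi:
            mid = (lo + hi) // 2
            if x[m[mid]] < x[i]:
                lo = mid + 1
            else:
                hi = mid - 1
        # x[i] comes after the end of the best subsequence of length lo - 1.
        p[i] = m[lo - 1] if lo > 1 else -1
        m[lo] = i
        L = max(L, lo)

    # Walk backwards through p from the end of the longest subsequence.
    result = []
    k = m[L] if L > 0 else -1
    while k != -1:
        result.append(x[k])
        k = p[k]
    result.reverse()
    return result


x = [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
print(longest_increasing_subsequence(x))  # [0, 2, 6, 9, 11, 15]
```

The backward walk is the "15 is after 11, 11 is after 9, ..." chain from above, just read out of p and reversed at the end.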