Understanding a median selection algorithm?

anon · Mar 5, 2012

I'm currently learning algorithms in my spare time and have the following question from studying the select() algorithm in chapter 3.

I understand that I can use the select() algorithm to find the median (the n/2-th smallest number) of an array A of n numbers.

1) This is the bit I'm struggling to understand. Suppose the array is A = [3, 7, 5, 1, 4, 2, 6, 2]. What are the contents of the array after each call to Partition(), and what are the parameters in each recursive call of Select()?

Can someone explain how to work this out, please?

Below is the pseudo-code for the two algorithms.

Select(A, p, r, k) {
    /* return k-th smallest number in A[p..r] */
    if (p==r) return A[p] /* base case */
    q := Partition(A,p,r)
    len := q - p + 1
    if (k == len) return A[q]
    else if (k<len) return Select(A,p,q-1,k)
    else return Select(A,q+1,r,k-len)
}

and the second code is

Partition(A, p, r) { /* partition A[p..r] */
    x := A[r] /* pivot */
    i := p-1
    for j := p to r-1 {
        if (A[j] <= x) {
            i++
            swap(A[i], A[j])
        }
    }
    swap(A[i+1], A[r])
    return i+1
}

The book I am using is called The Derivation of Algorithms by Anne Kaldewaij.

Answer

templatetypedef · Mar 5, 2012

This algorithm works in two steps. The partitioning step works by picking some pivot element, then rearranging the elements of the array such that everything less than the pivot is to one side, everything greater than the pivot is to the other side, and the pivot is in the correct place. For example, given the array

3  2  5  1  4

If we pick a pivot of 3, then we might partition the array like this:

2  1  3  5  4
+--+  ^  +--+
 ^    |    ^
 |    |    +--- Elements greater than 3
 |    +-------- 3, in the right place
 +------------- Elements less than 3

Notice that we haven't sorted the array; we've just made it closer to being sorted. This is, incidentally, the first step in quicksort.
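As a rough sketch of the partitioning step, here is the Partition pseudocode from the question translated to Python. Two assumptions on my part: indices are 0-based, and the pivot is always the last element of the range (as in the question's pseudocode), so on 3 2 5 1 4 the pivot is 4 rather than the 3 used in the picture above.

```python
def partition(a, p, r):
    """Partition a[p..r] in place around the pivot a[r]; return the pivot's final index."""
    x = a[r]          # pivot: last element of the range (as in the question's pseudocode)
    i = p - 1         # a[p..i] holds the elements seen so far that are <= pivot
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]   # drop the pivot between the two regions
    return i + 1


data = [3, 2, 5, 1, 4]
q = partition(data, 0, len(data) - 1)
print(data, q)  # [3, 2, 1, 4, 5] 3
```

Everything before index 3 is at most 4 and everything after it is greater, but neither side is sorted.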

The algorithm then uses the following logic. Suppose that we want to find the element that belongs at position k in sorted order (the kth smallest element, counting positions from 1). Then, in relation to the pivot we picked, there are three options:

  1. The pivot is at position k. Then, since the pivot is in the right place, the value we're looking for must be the pivot. We're done.
  2. The pivot is at position greater than k. Then the kth smallest element must be in the portion of the array before the pivot, so we can recursively search that portion of the array for the kth smallest element.
  3. The pivot is at position smaller than k. Then the kth smallest element must be somewhere in the upper region of the array, and we can recurse there.
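The three cases above can be sketched in Python roughly as follows. This mirrors the Select pseudocode from the question, so it compares k against len, the pivot's rank within the current subarray, rather than against an absolute position. The 0-based indices with a 1-based k are my assumption, and the function rearranges the list it is given.

```python
def partition(a, p, r):
    """Partition a[p..r] in place around the pivot a[r]; return the pivot's final index."""
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def select(a, p, r, k):
    """Return the k-th smallest element (k counted from 1) of a[p..r]."""
    if p == r:                        # base case: one element left
        return a[p]
    q = partition(a, p, r)
    length = q - p + 1                # rank of the pivot within a[p..r]
    if k == length:                   # case 1: the pivot is the answer
        return a[q]
    elif k < length:                  # case 2: answer lies before the pivot
        return select(a, p, q - 1, k)
    else:                             # case 3: answer lies after the pivot
        return select(a, q + 1, r, k - length)


# Each call gets a fresh copy because select() reorders its input.
print([select([3, 2, 5, 1, 4], 0, 4, k) for k in range(1, 6)])  # [1, 2, 3, 4, 5]
```

Note how case 3 passes k - length: the elements up to and including the pivot are discarded, so the target's rank shrinks by that many.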

In our case, suppose that we want the second-smallest element (the one at position 2). Since the pivot ended up at position 3, this means that the second-smallest element must be somewhere in the portion of the array before the pivot, so we would recurse on the subarray

2  1

If we wanted the actual median element, since the pivot ended up smack in the middle of the array, we would just output that the median is 3 and be done.

Finally, if we wanted something like the fourth-smallest element, then since the pivot is before position 4, we would recurse on the portion of the array after the pivot, namely

5  4

and would look for the smallest element there, since three elements (2, 1, and the pivot 3) come before this region and 4 - 3 = 1.

The rest of the algorithm is the details of how to do the partitioning step (which is probably the most involved part of the algorithm) and how to make the three-way choice about whether to recurse or not (a bit less difficult). Hopefully, though, this high-level structure helps the algorithm make more sense.
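To see those details play out on the array from the question, here is a small tracing sketch. It assumes 0-based indices (add 1 to p, r, and q to match the pseudocode) and k = n/2 = 4 for the median, as stated in the question. The `calls` and `snapshots` lists are just hypothetical bookkeeping for the trace.

```python
def partition(a, p, r):
    """Partition a[p..r] in place around the pivot a[r]; return the pivot's final index."""
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def select_traced(a, p, r, k, calls, snapshots):
    """Same logic as Select, but records each call and the array after each Partition."""
    calls.append((p, r, k))
    if p == r:
        return a[p]
    q = partition(a, p, r)
    snapshots.append(list(a))         # copy of the array right after partitioning
    length = q - p + 1
    if k == length:
        return a[q]
    elif k < length:
        return select_traced(a, p, q - 1, k, calls, snapshots)
    else:
        return select_traced(a, q + 1, r, k - length, calls, snapshots)


A = [3, 7, 5, 1, 4, 2, 6, 2]
calls, snapshots = [], []
median = select_traced(A, 0, len(A) - 1, len(A) // 2, calls, snapshots)

for c in calls:
    print("Select(p, r, k) =", c)
for s in snapshots:
    print("after Partition:", s)
print("median =", median)
# Select calls (0-based): (0, 7, 4), (3, 7, 1), (3, 4, 1), (3, 3, 1)
# After each Partition:   [1, 2, 2, 3, 4, 7, 6, 5]
#                         [1, 2, 2, 3, 4, 5, 6, 7]
#                         [1, 2, 2, 3, 4, 5, 6, 7]
# median = 3
```

In the pseudocode's 1-based terms, the calls are Select(A,1,8,4), Select(A,4,8,1), Select(A,4,5,1), and Select(A,4,4,1), which hits the base case and returns 3.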

Hope this helps!