O(n) algorithm to find the median of a collection of numbers

ejf071189 · Nov 17, 2010 · Viewed 52.5k times

Problem: the input is a (not necessarily sorted) sequence S = k1, k2, ..., kn of n arbitrary numbers. Consider the collection C of n² numbers of the form min{ki, kj}, for 1 <= i, j <= n. Present an O(n) time and O(n) space algorithm to find the median of C.

So far, by examining C for different sets S, I've found that the smallest number in S appears (2n-1) times in C, the next smallest appears (2n-3) times, and so on, until the largest number in S appears only once.
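A quick brute-force check of this pattern, as a sketch only: it builds C explicitly in O(n²) time, so it is not the requested algorithm. The function name `pairwise_min_counts` and the sample values are made up for illustration, and the values are assumed to be distinct so that each one has a well-defined rank.

```python
from collections import Counter

def pairwise_min_counts(s):
    """Count how often each value occurs in C = {min(ki, kj)} (brute force, O(n^2))."""
    return Counter(min(a, b) for a in s for b in s)

s = [7, 2, 9, 4]              # arbitrary distinct sample values (an assumption)
n = len(s)
counts = pairwise_min_counts(s)

# The r-th smallest value of S (r = 1..n) should occur 2(n - r) + 1 times in C.
for rank, value in enumerate(sorted(s), start=1):
    assert counts[value] == 2 * (n - rank) + 1
```

For n = 4 this gives counts of 7, 5, 3 and 1, which sum to n² = 16 as expected.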

Is there a way to use this information to find the median of C?

Answer

Jerry Coffin · Nov 17, 2010

There are a number of possibilities. One I like is Hoare's Select algorithm (also known as quickselect), which runs in linear time on average. The basic idea is similar to Quicksort, except that when you recurse, you only recurse into the partition that will hold the number(s) you're looking for.

For example, if you want the median of 100 numbers, you'd start by partitioning the array, just like in Quicksort. You'd get two partitions, one of which contains the 50th element. Recursively carry out your selection in that partition. Continue until your partition contains only one element, which will be the median. The same procedure works for any other rank you choose, not just the median.
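A minimal sketch of the selection idea described above, written iteratively with a random pivot and a Hoare-style partition. The names `quickselect` and `median_of_pairwise_mins` are hypothetical. The second function is one possible way to combine selection with the counts observed in the question, not something stated in this answer: since the r-th smallest element of S occurs 2(n - r) + 1 times in C, the r smallest elements account for n² - (n - r)² entries of C, and the median falls on the smallest r for which that total reaches half of n². Running time is linear in expectation, not in the worst case, and `math.isqrt` assumes Python 3.8 or later.

```python
import math
import random

def quickselect(items, k):
    """Return the k-th smallest element (0-based) of items in expected O(n) time."""
    items = list(items)                  # copy, so the caller's list is untouched
    if not 0 <= k < len(items):
        raise IndexError("k out of range")
    lo, hi = 0, len(items) - 1
    while lo < hi:
        pivot = items[random.randint(lo, hi)]
        i, j = lo, hi
        # Hoare-style partition of items[lo..hi] around the pivot value.
        while i <= j:
            while items[i] < pivot:
                i += 1
            while items[j] > pivot:
                j -= 1
            if i <= j:
                items[i], items[j] = items[j], items[i]
                i += 1
                j -= 1
        # Now items[lo..j] <= pivot <= items[i..hi]; anything strictly
        # between j and i equals the pivot.
        if k <= j:
            hi = j                       # continue in the left partition only
        elif k >= i:
            lo = i                       # continue in the right partition only
        else:
            return items[k]              # k landed on a pivot-valued slot
    return items[lo]

def median_of_pairwise_mins(s):
    """Median of C = {min(ki, kj)} without building C (assumes s is non-empty)."""
    n = len(s)
    # Smallest 1-based rank r with n^2 - (n - r)^2 >= ceil(n^2 / 2).
    r = n - math.isqrt(n * n // 2)
    return quickselect(s, r - 1)
```

For instance, `median_of_pairwise_mins([7, 2, 9, 4])` returns 4: here n = 4, so r = 2, and the second smallest element is 4. If a worst-case O(n) bound is required rather than an expected one, the random pivot could be replaced by a median-of-medians pivot.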