What is the right approach when using STL container for median calculation?

sharkin picture sharkin · Nov 12, 2009 · Viewed 32.9k times · Source

Let's say I need to retrieve the median from a sequence of 1000000 random numeric values.

If using anything but std::list, I have no (built-in) way to sort sequence for median calculation.

If using std::list, I can't randomly access values to retrieve middle (median) of sorted sequence.

Is it better to implement sorting myself and go with e.g. std::vector, or is it better to use std::list and use std::list::iterator to for-loop-walk to the median value? The latter seems less overheadish, but also feels more ugly..

Or are there more and better alternatives for me?

Answer

Mike Seymour picture Mike Seymour · Nov 12, 2009

Any random-access container (like std::vector) can be sorted with the standard std::sort algorithm, available in the <algorithm> header.

For finding the median, it would be quicker to use std::nth_element; this does enough of a sort to put one chosen element in the correct position, but doesn't completely sort the container. So you could find the median like this:

int median(vector<int> &v)
{
    size_t n = v.size() / 2;
    nth_element(v.begin(), v.begin()+n, v.end());
    return v[n];
}