Sorting in linear time and in place

Roronoa picture Roronoa · Mar 28, 2013 · Viewed 8k times · Source

Suppose that n records have keys in the range from 1 to k.

  • Write an algorithm to sort the records in place in O(n+k) time.
  • You may use O(k) storage outside the input array.
  • Is your algorithm stable?

if we use counting sort to we can do it in O(n+k) time and is stable but its not in place.
if k=2 it can be done in place but its not stable (using two variables to maintain the indexes in the array for k=0 and k=1)
but for k>2 i couldnt think of any good algo

Answer

nwellnhof picture nwellnhof · Mar 30, 2013

First, let's rehash how counting sort works:

  • Count how often every key exists in the array to be sorted. These counts are written to an array of size k.
  • Compute the partial sums of the key counts. This gives the starting position for every bin of equal keys in the sorted array.
  • Move the items in the array to their final position incrementing the starting position of the corresponding bin for every item.

Now the question is how to perform the final step in-place. The standard approach for an in-place permutation is to select the first element and swap it with the element that takes its correct position. This step is repeated with the swapped element until we hit a element that belongs in the first position (a cycle has been completed). Then the whole procedure is repeated for the elements at the second, third, etc. position until the whole array has been processed.

The problem with counting sort is that the final positions are not readily available but are computed by incrementing every bin's starting position in the final loop. In order to never increment the starting position twice for an element, we have to find a way to determine if an element at a certain position has been moved there already. This can be done by keeping track of the original starting position for every bin. If an element lies between the original starting position and the position for the next element of a bin, it has been already touched.

Here's an implementation in C99 that runs in O(n+k) and requires only two arrays of size k as extra storage. The final permutation step is not stable.

#include <stdlib.h>

void in_place_counting_sort(int *a, int n, int k)
{
    int *start = (int *)calloc(k + 1, sizeof(int));
    int *end   = (int *)malloc(k * sizeof(int));

    // Count.
    for (int i = 0; i < n; ++i) {
        ++start[a[i]];
    }

    // Compute partial sums.
    for (int bin = 0, sum = 0; bin < k; ++bin) {
        int tmp = start[bin];
        start[bin] = sum;
        end[bin]   = sum;
        sum += tmp;
    }
    start[k] = n;

    // Move elements.
    for (int i = 0, cur_bin = 0; i < n; ++i) {
        while (i >= start[cur_bin+1]) { ++cur_bin; }
        if (i < end[cur_bin]) {
            // Element has already been processed.
            continue;
        }

        int bin = a[i];
        while (bin != cur_bin) {
            int j = end[bin]++;
            // Swap bin and a[j]
            int tmp = a[j];
            a[j] = bin;
            bin = tmp;
        }
        a[i] = bin;
        ++end[cur_bin];
    }

    free(start);
    free(end);
}

Edit: Here's another version using only a single array of size k based on Mohit Bhura's approach.

#include <stdlib.h>

void in_place_counting_sort(int *a, int n, int k)
{
    int *counts = (int *)calloc(k, sizeof(int));

    // Count.
    for (int i = 0; i < n; ++i) {
        ++counts[a[i]];
    }

    // Compute partial sums.
    for (int val = 0, sum = 0; val < k; ++val) {
        int tmp = counts[val];
        counts[val] = sum;
        sum += tmp;
    }

    // Move elements.
    for (int i = n - 1; i >= 0; --i) {
        int val = a[i];
        int j   = counts[val];

        if (j < i) {
            // Process a fresh cycle. Since the index 'i' moves
            // downward and the counts move upward, it is
            // guaranteed that a value is never moved twice.

            do {
                ++counts[val];

                // Swap val and a[j].
                int tmp = val;
                val  = a[j];
                a[j] = tmp;

                j = counts[val];
            } while (j < i);

            // Move final value into place.
            a[i] = val;
        }
    }

    free(counts);
}