I've been doing some research on replacement selection sort, and I can't find any implementations of it or a good, thorough implementation of replacement selection sort anywhere! Maybe I'm not looking hard enough, but Google is confusing replacement selection sort with selection sort... so this got me wondering:
What are the real differences between selection sort and replacement selection sort?
Where can I find an implementation of replacement selection sort (or a guide to write it)?
What are the characteristics of replacement selection sort that make it more desirable than other sorting algorithms?
Is this algorithm known by any other names?
I have not seen this algorithm described in much detail before, and am basis my analysis on what I have gathered from reading this set of lecture notes.
By my understanding, the key difference between selection sort and replacement selection sort is that selection sort is designed to sort a complete sequence held in main memory, while replacement selection sort is designed to convert an unsorted sequence too large to fit into main memory into a series of "strands" of sorted sequences that can be stored in external memory. These external strands can then be merged together to form the overall sorted sequence. Despite the similarity in their name and in one or two key steps in the operation of the algorithm, they're designed to solve fundamentally different problems.
There are many good tutorials on selection sort available online, so I won't spend too much time discussing it. Intuitively, the algorithm works as follows:
This assumes that the array can be held completely in memory, and if that's the case this algorithm runs in Θ(n2) time. It's not very fast, and for large data sets is not advisable.
This algorithm was described in 1965 by Donald Knuth, so it's designed to work in a very different computing environment than the one we're currently used to. Computers have very little memory (usually, some fixed number of registers), but have access to large external drives. It's common to build algorithms that would load some values into registers, process them there, then flush them directly back to the external storage. (Interestingly, this is similar to how processors work right now, except with main memory instead of external storage).
Let's assume that we have enough space in memory to hold two arrays: a first array Values
of size n that can hold a bunch of values, and a second array Active
of size n that holds boolean values. We're going to try to take a large input stream of unsorted values and do our best job to sort it, given that we only have enough room in memory to hold the Active
and Values
arrays, plus a few extra variables for storage space.
The idea behind the algorithm is as follows. First, load n
values from the external source containing the unsorted sequence directly into the Values
array. Then, set all of the Active
values to true
. For example, if n = 4
, we might have this setup:
Values: 4 1 0 3
Active: Yes Yes Yes Yes
The replacement selection sort algorithm works by repeatedly looking for the lowest value in the Values
array and writing it out to the output stream. In this case, we start off by finding the 0 value and writing it to the stream. This gives
Values: 4 1 3
Active: Yes Yes Yes Yes
Output: 0
Now, we have a blank spot in our Values
array, so we can pull another value in from the external source. Let's suppose that we get 2. In that case, we have this setup:
Values: 4 1 2 3
Active: Yes Yes Yes Yes
Output: 0
Notice that since 2 > 0 and 0 was the smallest element here, we're guaranteed that when we wrote out the 0 to the output, the 2 shouldn't have come before it. That's good. Therefore, we go on to the next step of the algorithm and again find the smallest element here. That's the 1, so we send it to the output device:
Values: 4 2 3
Active: Yes Yes Yes Yes
Output: 0 1
Now, read in another value from the external source:
Values: 4 -1 2 3
Active: Yes Yes Yes Yes
Output: 0 1
Now we're in trouble. This new value (-1) is lower than the 1, meaning that if we truly wanted this value to go into the output in sorted order, it should come before the 1. However, we don't have enough memory to reread from the output device and fix it up. Instead, we're going to do the following. For now, let's keep the -1 in memory. We're going to do our best to sort the remaining elements, but when we're done doing so, we'll do a second iteration of generating a sorted sequence, and will put the -1 into that sequence. In other words, instead of producing one sorted sequence, we're going to produce two.
To indicate in memory that we don't want to write out the -1 yet, we're going to mark the slot holding -1 as inactive. This is shown here:
Values: 4 -1 2 3
Active: Yes NO Yes Yes
Output: 0 1
From now on, we'll just pretend that the -1 isn't here.
Let's continue onward. We now find the smallest value in memory that is still active (2), and write it out to the device:
Values: 4 -1 3
Active: Yes NO Yes Yes
Output: 0 1 2
We now pull the next value from the input device. Let's suppose it's 7:
Values: 4 -1 7 3
Active: Yes NO Yes Yes
Output: 0 1 2
Since 7 > 2, it comes after the 2 in the output, so we don't do anything.
On the next iteration, we find the lowest active value (3) and write it out:
Values: 4 -1 7
Active: Yes NO Yes Yes
Output: 0 1 2 3
We pull the next value from the input device. Let's suppose it's also 3. In this case, we know that since 3 is the smallest value, we can just write it directly to the output stream, since 3 is the smallest of all the values here and we can save ourselves an iteration:
Values: 4 -1 7
Active: Yes NO Yes Yes
Output: 0 1 2 3 3
We now pull the next value from the input device. Suppose it's 2. In that case, as before, we know that the 2 should come before the 3. Just like the earlier -1, this means that we need to hold the 2 in memory for now; we'll write it out later on. Now our setup looks like this:
Values: 4 -1 7 2
Active: Yes NO Yes NO
Output: 0 1 2 3 3
Now, we find the smallest active value (4) and write it to the output device:
Values: -1 7 2
Active: Yes NO Yes NO
Output: 0 1 2 3 3 4
Suppose we now read in a 1 as our next input. We therefore put it into Values
, but mark it inactive:
Values: 1 -1 7 2
Active: NO NO Yes NO
Output: 0 1 2 3 3 4
There's only one active value, which is 7, so we write it out:
Values: 1 -1 2
Active: NO NO Yes NO
Output: 0 1 2 3 3 4 7
Let's suppose we now read a 5. In that case, as before, we store it but mark the slot inactive:
Values: 1 -1 5 2
Active: NO NO NO NO
Output: 0 1 2 3 3 4 7
Notice that all the values are now inactive. This means that we've flushed out from memory all the values that can go into the current output run. Now, we need to go and write out all of the values we were holding for later. To do this, we mark all the values as active, then repeat as before:
Values: 1 -1 5 2
Active: Yes Yes Yes Yes
Output: 0 1 2 3 3 4 7
-1 is the smallest value, so we output it:
Values: 1 5 2
Active: Yes Yes Yes Yes
Output: 0 1 2 3 3 4 7 -1
Suppose that we read a 3. -1 < 3, so we load it into the Values
array.
Values: 1 3 5 2
Active: Yes Yes Yes Yes
Output: 0 1 2 3 3 4 7 -1
1 is the smallest value here, so we remove it:
Values: 3 5 2
Active: Yes Yes Yes Yes
Output: 0 1 2 3 3 4 7 -1 1
Let's suppose that we're now out of input values. We'll mark this slot as done:
Values: --- 3 5 2
Active: Yes Yes Yes Yes
Output: 0 1 2 3 3 4 7 -1 1
2 comes next:
Values: --- 3 5 ---
Active: Yes Yes Yes Yes
Output: 0 1 2 3 3 4 7 -1 1 2
Then 3:
Values: --- --- 5 ---
Active: Yes Yes Yes Yes
Output: 0 1 2 3 3 4 7 -1 1 2 3
Finally, 5:
Values: --- --- --- ---
Active: Yes Yes Yes Yes
Output: 0 1 2 3 3 4 7 -1 1 2 3 5
And we're done! Note that the resulting sequence is not sorted, but it's a lot better than before. It now consists of two strands that are in sorted order. Merging them together (in the same way we'd do a merge for mergesort) will sort the resulting array. This algorithm could potentially have produced far more strands, but since our sample input was small, we only had two.
So how fast is this? Well, each iteration of the loop does a total of at most n comparisons (in memory), one read, and one write. Thus if there are N total values in the stream, the algorithm does O(nN) comparisons and O(N) memory operations. If memory operations are expensive, this isn't too bad, though you'll need a second pass at the end to merge everything back together.
In pseudocode, the algorithm looks like this:
Make Values an array of n elements.
Make Active an array of n booleans, all initially true.
Read n values from memory into Values.
Until no values are left to process:
Find the smallest value that is still active.
Write it to the output device.
Read from the input device into the slot where the old element was.
If it was smaller than the old element, mark the old slot inactive.
If all slots are inactive, mark them all active.
I would be shocked if there was any reason to code this algorithm up now. This made sense many decades ago when memory was really, really small. Nowadays, there are far better external sorting algorithms available, and they'd almost certainly outperform this algorithm.
Hope this helps!