Random playlist algorithm

Alon Gubkin picture Alon Gubkin · Nov 29, 2009 · Viewed 7.7k times · Source

I need to create a list of numbers from a range (for example from x to y) in a random order so that every order has an equal chance.

I need this for a music player I write in C#, to create play lists in a random order.

Any ideas?

Thanks.

EDIT: I'm not interested in changing the original list, just pick up random indexes from a range in a random order so that every order has an equal chance.

Here's what I've wrriten so far:

    public static IEnumerable<int> RandomIndexes(int count)
    {
        if (count > 0)
        {
            int[] indexes = new int[count];
            int indexesCountMinus1 = count - 1;

            for (int i = 0; i < count; i++)
            {
                indexes[i] = i;
            }

            Random random = new Random();

            while (indexesCountMinus1 > 0)
            {
                int currIndex = random.Next(0, indexesCountMinus1 + 1);
                yield return indexes[currIndex];

                indexes[currIndex] = indexes[indexesCountMinus1];
                indexesCountMinus1--;
            }

            yield return indexes[0];
        }
    }

It's working, but the only problem of this is that I need to allocate an array in the memory in the size of count. I'm looking for something that dose not require memory allocation.

Thanks.

Answer

Ryan Emerle picture Ryan Emerle · Nov 29, 2009

This can actually be tricky if you're not careful (i.e., using a naïve shuffling algorithm). Take a look at the Fisher-Yates/Knuth shuffle algorithm for proper distribution of values.

Once you have the shuffling algorithm, the rest should be easy.

Here's more detail from Jeff Atwood.

Lastly, here's Jon Skeet's implementation and description.

EDIT

I don't believe that there's a solution that satisfies your two conflicting requirements (first, to be random with no repeats and second to not allocate any additional memory). I believe you may be prematurely optimizing your solution as the memory implications should be negligible, unless you're embedded. Or, perhaps I'm just not smart enough to come up with an answer.

With that, here's code that will create an array of evenly distributed random indexes using the Knuth-Fisher-Yates algorithm (with a slight modification). You can cache the resulting array, or perform any number of optimizations depending on the rest of your implementation.

  private static int[] BuildShuffledIndexArray( int size ) {

     int[] array = new int[size];
     Random rand = new Random();
     for ( int currentIndex = array.Length - 1; currentIndex > 0; currentIndex-- ) {
        int nextIndex = rand.Next( currentIndex + 1 );
        Swap( array, currentIndex, nextIndex );
     }
     return array;
  }

  private static void Swap( IList<int> array, int firstIndex, int secondIndex ) {

     if ( array[firstIndex] == 0 ) {
        array[firstIndex] = firstIndex;
     }
     if ( array[secondIndex] == 0 ) {
        array[secondIndex] = secondIndex;
     }
     int temp = array[secondIndex];
     array[secondIndex] = array[firstIndex];
     array[firstIndex] = temp;
  }

NOTE: You can use ushort instead of int to half the size in memory as long as you don't have more than 65,535 items in your playlist. You could always programmatically switch to int if the size exceeds ushort.MaxValue. If I, personally, added more than 65K items to a playlist, I wouldn't be shocked by increased memory utilization.

Remember, too, that this is a managed language. The VM will always reserve more memory than you are using to limit the number of times it needs to ask the OS for more RAM and to limit fragmentation.

EDIT

Okay, last try: we can look to tweak the performance/memory trade off: You could create your list of integers, then write it to disk. Then just keep a pointer to the offset in the file. Then every time you need a new number, you just have disk I/O to deal with. Perhaps you can find some balance here, and just read N-sized blocks of data into memory where N is some number you're comfortable with.

Seems like a lot of work for a shuffle algorithm, but if you're dead-set on conserving memory, then at least it's an option.