Reversible shuffle algorithm using a key

Tush · Aug 22, 2010 · Viewed 7.3k times

How would I code a reversible shuffle algorithm in C# which uses a key to shuffle and can be reversed to the original state?

For instance, I have a string: "Hello world". How can I shuffle it so that I can later reverse the shuffled string back to "Hello world"?

Answer

Steve Jessop · Aug 22, 2010

Look at the Fisher-Yates shuffle for a way to permute the string based on a key. Feed the key as the seed into a PRNG, and use that PRNG to generate the random numbers used by the shuffle.
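A minimal C# sketch of that keyed shuffle might look like the following. The class name `KeyedShuffle` and the choice of an `int` key are assumptions made for illustration, and `System.Random` stands in for "a PRNG"; any generator that is deterministic for a given seed would do.

```csharp
using System;

static class KeyedShuffle
{
    // Shuffles the characters of text with Fisher-Yates, driven by a
    // PRNG seeded with key. The int key is an assumption of this sketch.
    public static string Shuffle(string text, int key)
    {
        char[] chars = text.ToCharArray();
        var rng = new Random(key); // same key -> same sequence of numbers

        // Walk from the last index down to 1, swapping each position
        // with a randomly chosen position at or below it.
        for (int i = chars.Length - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1); // 0 <= j <= i
            char tmp = chars[i];
            chars[i] = chars[j];
            chars[j] = tmp;
        }
        return new string(chars);
    }
}
```

Calling `KeyedShuffle.Shuffle("Hello world", 42)` twice on the same runtime gives the same permutation both times. Two caveats: the .NET documentation does not promise that a seeded `Random` produces the same sequence across major versions, so if the shuffled data outlives the program you may prefer a generator you implement yourself; and `ToCharArray` works on UTF-16 code units, so text containing surrogate pairs would need extra care.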

Now, how to reverse the process? Fisher-Yates works by swapping certain pairs of elements. So to reverse the process you can feed the same key into the same PRNG, then run through the Fisher-Yates algorithm as if you were shuffling an array the size of your string. But actually you don't move anything, just record the indexes of the elements that would be swapped at each stage.

Once you've done this, run through your list of swaps in reverse, applying them to your shuffled string. The result is the original string.
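The record-then-replay idea could be sketched in C# roughly as below. Everything named here (`ReversibleShuffle`, `SwapPartners`, the `int` key) is hypothetical, and `System.Random` is assumed as the PRNG. The dry run only records which index each position would be swapped with; the deshuffle then applies those swaps in the opposite order.

```csharp
using System;

static class ReversibleShuffle
{
    // Dry run of Fisher-Yates: nothing is moved, we only record the
    // partner index chosen for each position. partners[0] is unused.
    static int[] SwapPartners(int length, int key)
    {
        var rng = new Random(key);
        var partners = new int[length];
        for (int i = length - 1; i > 0; i--)
            partners[i] = rng.Next(i + 1); // 0 <= partners[i] <= i
        return partners;
    }

    static void Swap(char[] a, int i, int j)
    {
        char tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static string Shuffle(string text, int key)
    {
        char[] chars = text.ToCharArray();
        int[] partners = SwapPartners(chars.Length, key);
        // Forward order: i = n-1 down to 1.
        for (int i = chars.Length - 1; i > 0; i--)
            Swap(chars, i, partners[i]);
        return new string(chars);
    }

    public static string Unshuffle(string shuffled, int key)
    {
        char[] chars = shuffled.ToCharArray();
        int[] partners = SwapPartners(chars.Length, key);
        // Reverse order: i = 1 up to n-1. Each swap undoes itself.
        for (int i = 1; i < chars.Length; i++)
            Swap(chars, i, partners[i]);
        return new string(chars);
    }
}
```

With this, `ReversibleShuffle.Unshuffle(ReversibleShuffle.Shuffle("Hello world", 42), 42)` should give back "Hello world", as long as both calls run against the same `Random` implementation.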

So for example, suppose we've shuffled the string "hello" using the following swaps (I haven't used a PRNG here, I rolled dice, but the point about a PRNG is it gives you the same sequence of numbers given the same seed):

(4,0): "hello" -> "oellh"
(3,3): "oellh" -> "oellh"
(2,1): "oellh" -> "olelh"
(1,0): "olelh" -> "loelh"

So, the shuffled string is "loelh".

To deshuffle, I generate the same series of "random" numbers, 0, 3, 1, 0. Then apply the swaps in reverse order:

(1,0): "loelh" -> "olelh"
(2,1): "olelh" -> "oellh"
(3,3): "oellh" -> "oellh"
(4,0): "oellh" -> "hello"

Success!
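To check the walkthrough by machine, here is a small C# sketch that replays exactly those dice rolls instead of calling a PRNG. The `WorkedExample` class and its `Apply` method are made up for this illustration.

```csharp
using System;

static class WorkedExample
{
    // picks[k] is the "random" number paired with position i = n - 1 - k,
    // so picks = {0, 3, 1, 0} on a 5-char string encodes the swaps
    // (4,0), (3,3), (2,1), (1,0). Assumes picks.Length == text.Length - 1.
    public static string Apply(string text, int[] picks, bool reverse)
    {
        char[] chars = text.ToCharArray();
        int n = chars.Length;
        for (int step = 0; step < picks.Length; step++)
        {
            // Forward: first pick first. Reverse: last pick first.
            int k = reverse ? picks.Length - 1 - step : step;
            int i = n - 1 - k;
            int j = picks[k];
            char tmp = chars[i];
            chars[i] = chars[j];
            chars[j] = tmp;
        }
        return new string(chars);
    }
}
```

With `int[] picks = { 0, 3, 1, 0 };`, the call `WorkedExample.Apply("hello", picks, false)` returns "loelh", and `WorkedExample.Apply("loelh", picks, true)` returns "hello".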

The downside of this of course is that it uses a lot of memory for the deshuffle: an array of indexes as long as your original array of chars. So for truly huge arrays, you might want to choose a PRNG (or anyway a sequence-generation function) that can be stepped either forwards or backwards without having to store all the output. This rules out hash-based cryptographically secure PRNGs, but LFSRs are reversible.
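As a rough illustration of the constant-memory variant, the C# sketch below uses a 16-bit Fibonacci LFSR that can be stepped in both directions. The names (`LfsrShuffle`, `Next`, `Prev`), the tap positions, and the `ushort` key are all assumptions of this sketch. A 16-bit LFSR repeats after at most 65535 steps, consecutive states are strongly correlated, and `state % (i + 1)` is slightly biased, so treat this as a demonstration of reversibility rather than a good shuffle, and certainly not as anything secure.

```csharp
using System;

static class LfsrShuffle
{
    // One forward step: shift right, feed back the XOR of bits 0, 2, 3, 5.
    static ushort Next(ushort s)
    {
        int bit = (s ^ (s >> 2) ^ (s >> 3) ^ (s >> 5)) & 1;
        return (ushort)((s >> 1) | (bit << 15));
    }

    // Exact inverse of Next: old bits 2, 3, 5 now sit at 1, 2, 4, and the
    // feedback bit sits at 15, so the lost bit 0 is the XOR of those four.
    static ushort Prev(ushort s)
    {
        int bit = ((s >> 15) ^ (s >> 1) ^ (s >> 2) ^ (s >> 4)) & 1;
        return (ushort)(((s << 1) & 0xFFFF) | bit);
    }

    static void Swap(char[] a, int i, int j)
    {
        char tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static string Shuffle(string text, ushort key)
    {
        if (key == 0) throw new ArgumentException("key must be non-zero", nameof(key));
        char[] chars = text.ToCharArray();
        ushort state = key;
        for (int i = chars.Length - 1; i > 0; i--)
        {
            state = Next(state);
            Swap(chars, i, state % (i + 1));
        }
        return new string(chars);
    }

    public static string Unshuffle(string shuffled, ushort key)
    {
        if (key == 0) throw new ArgumentException("key must be non-zero", nameof(key));
        char[] chars = shuffled.ToCharArray();
        ushort state = key;
        // Fast-forward to the state used for the final swap; nothing is stored.
        for (int i = chars.Length - 1; i > 0; i--)
            state = Next(state);
        // Undo the swaps last-to-first while stepping the generator backwards.
        for (int i = 1; i < chars.Length; i++)
        {
            Swap(chars, i, state % (i + 1));
            state = Prev(state);
        }
        return new string(chars);
    }
}
```

The deshuffle still costs one extra pass to reach the final generator state, but it needs only a single 16-bit state variable instead of an index array as long as the input.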

Btw, why do you want to do this?