How to tell if an array is a permutation in O(n)?

INS picture INS · May 20, 2010 · Viewed 15.6k times · Source

Input: A read-only array of N elements containing integer values from 1 to N (some integer values can appear more than once!). And a memory zone of a fixed size (10, 100, 1000 etc - not depending on N).

How to tell in O(n) if the array represents a permutation?

--What I achieved so far (an answer proved that this was not good):--

  1. I use the limited memory area to store the sum and the product of the array.
  2. I compare the sum with N*(N+1)/2 and the product with N!

I know that if condition (2) is true I might have a permutation. I'm wondering if there's a way to prove that condition (2) is sufficient to tell if I have a permutation. So far I haven't figured this out ...

Answer

Jason S picture Jason S · May 20, 2010

I'm very slightly skeptical that there is a solution. Your problem seems to be very close to one posed several years ago in the mathematical literature, with a summary given here ("The Duplicate Detection Problem", S. Kamal Abdali, 2003) that uses cycle-detection -- the idea being the following:

If there is a duplicate, there exists a number j between 1 and N such that the following would lead to an infinite loop:

x := j;
do
{
   x := a[x];
}
while (x != j);

because a permutation consists of one or more subsets S of distinct elements s0, s1, ... sk-1 where sj = a[sj-1] for all j between 1 and k-1, and s0 = a[sk-1], so all elements are involved in cycles -- one of the duplicates would not be part of such a subset.

e.g. if the array = [2, 1, 4, 6, 8, 7, 9, 3, 8]

then the element in bold at position 5 is a duplicate because all the other elements form cycles: { 2 -> 1, 4 -> 6 -> 7 -> 9 -> 8 -> 3}. Whereas the arrays [2, 1, 4, 6, 5, 7, 9, 3, 8] and [2, 1, 4, 6, 3, 7, 9, 5, 8] are valid permutations (with cycles { 2 -> 1, 4 -> 6 -> 7 -> 9 -> 8 -> 3, 5 } and { 2 -> 1, 4 -> 6 -> 7 -> 9 -> 8 -> 5 -> 3 } respectively).

Abdali goes into a way of finding duplicates. Basically the following algorithm (using Floyd's cycle-finding algorithm) works if you happen across one of the duplicates in question:

function is_duplicate(a, N, j)
{
     /* assume we've already scanned the array to make sure all elements
        are integers between 1 and N */
     x1 := j;
     x2 := j;
     do
     {             
         x1 := a[x1];
         x2 := a[x2];
         x2 := a[x2];
     } while (x1 != x2);

     /* stops when it finds a cycle; x2 has gone around it twice, 
        x1 has gone around it once.
        If j is part of that cycle, both will be equal to j. */
     return (x1 != j);
}

The difficulty is I'm not sure your problem as stated matches the one in his paper, and I'm also not sure if the method he describes runs in O(N) or uses a fixed amount of space. A potential counterexample is the following array:

[3, 4, 5, 6, 7, 8, 9, 10, ... N-10, N-9, N-8, N-7, N-2, N-5, N-5, N-3, N-5, N-1, N, 1, 2]

which is basically the identity permutation shifted by 2, with the elements [N-6, N-4, and N-2] replaced by [N-2, N-5, N-5]. This has the correct sum (not the correct product, but I reject taking the product as a possible detection method since the space requirements for computing N! with arbitrary precision arithmetic are O(N) which violates the spirit of the "fixed memory space" requirement), and if you try to find cycles, you will get cycles { 3 -> 5 -> 7 -> 9 -> ... N-7 -> N-5 -> N-1 } and { 4 -> 6 -> 8 -> ... N-10 -> N-8 -> N-2 -> N -> 2}. The problem is that there could be up to N cycles, (identity permutation has N cycles) each taking up to O(N) to find a duplicate, and you have to keep track somehow of which cycles have been traced and which have not. I'm skeptical that it is possible to do this in a fixed amount of space. But maybe it is.

This is a heavy enough problem that it's worth asking on mathoverflow.net (despite the fact that most of the time mathoverflow.net is cited on stackoverflow it's for problems which are too easy)


edit: I did ask on mathoverflow, there's some interesting discussion there.