It is universally agreed that a list of n distinct symbols has n! permutations. However, when the symbols are not distinct, the most common convention, in mathematics and elsewhere, seems to be to count only distinct permutations. Thus the permutations of the list [1, 1, 2] are usually considered to be [1, 1, 2], [1, 2, 1], [2, 1, 1]. Indeed, the following C++ code prints precisely those three:
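#include <algorithm>
#include <iostream>
int main() {
    int a[] = {1, 1, 2};  // must start sorted to visit every permutation
    do {
        std::cout << a[0] << " " << a[1] << " " << a[2] << std::endl;
    } while (std::next_permutation(a, a + 3));
}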
On the other hand, Python's itertools.permutations seems to print something else:
import itertools
for a in itertools.permutations([1, 1, 2]):
    print(a)
This prints
(1, 1, 2)
(1, 2, 1)
(1, 1, 2)
(1, 2, 1)
(2, 1, 1)
(2, 1, 1)
As user Artsiom Rudzenka pointed out in an answer, the Python documentation says so:
Elements are treated as unique based on their position, not on their value.
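To make the quoted sentence concrete, here is a small sketch (assuming Python 3) that counts what itertools.permutations yields for [1, 1, 2] and then collapses the result by value with set(). The set() step is only an illustration of the difference, not an efficient way to get distinct permutations.

```python
import itertools

# Positions are what get permuted, so three elements give 3! = 6 tuples,
# even though two of the elements are equal.
perms = list(itertools.permutations([1, 1, 2]))
print(len(perms))          # 6

# Collapsing by value afterwards leaves the three distinct arrangements.
print(sorted(set(perms)))  # [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
```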
My question: why was this design decision made?
It seems that following the usual convention would give results that are more useful (and indeed it is usually exactly what I want)... or is there some application of Python's behaviour that I'm missing?
[Or is it some implementation issue? The algorithm as in next_permutation, for instance explained on StackOverflow here (by me) and shown here to be O(1) amortised, seems efficient and implementable in Python, but is Python doing something even more efficient since it doesn't guarantee lexicographic order based on value? And if so, was the increase in efficiency considered worth it?]
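As a rough illustration of what a next_permutation-style generator could look like in Python, here is a minimal sketch. It is not how itertools is implemented; the names next_permutation and distinct_permutations_sorted are made up for this example, and it assumes the elements can be compared with <.

```python
def next_permutation(seq):
    """Rearrange the list seq in place into its next lexicographic permutation.

    Returns False (after resetting seq to ascending order) when seq was
    already the last permutation, True otherwise. Hypothetical helper.
    """
    # Find the rightmost index i such that seq[i] < seq[i + 1].
    i = len(seq) - 2
    while i >= 0 and not seq[i] < seq[i + 1]:
        i -= 1
    if i < 0:
        seq.reverse()
        return False
    # Find the rightmost index j > i such that seq[i] < seq[j].
    j = len(seq) - 1
    while not seq[i] < seq[j]:
        j -= 1
    seq[i], seq[j] = seq[j], seq[i]
    # The suffix after i is non-increasing; reverse it to make it smallest.
    seq[i + 1:] = reversed(seq[i + 1:])
    return True


def distinct_permutations_sorted(iterable):
    """Yield each distinct permutation once, in lexicographic order."""
    seq = sorted(iterable)  # requires a total ordering on the elements
    while True:
        yield tuple(seq)
        if not next_permutation(seq):
            return


print(list(distinct_permutations_sorted([1, 1, 2])))
# [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
```

Because the step only ever uses <, equal elements are never swapped past each other, which is why duplicates do not appear.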
I can't speak for the designer of itertools.permutations (Raymond Hettinger), but it seems to me that there are a couple of points in favour of the design:
First, if you used a next_permutation-style approach, then you'd be restricted to passing in objects that support a linear ordering, whereas itertools.permutations provides permutations of any kind of object. Imagine how annoying this would be:
>>> list(itertools.permutations([1+2j, 1-2j, 2+1j, 2-1j]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: no ordering relation is defined for complex numbers
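The traceback above is hypothetical. As a quick check of the actual behaviour (a sketch, assuming Python 3), itertools.permutations handles complex numbers without complaint, while anything that needs to order them, such as sorted(), raises TypeError:

```python
import itertools

values = [1+2j, 1-2j, 2+1j, 2-1j]

# Permuting by position never compares the elements, so this just works.
perms = list(itertools.permutations(values))
print(len(perms))  # 24

# An ordering-based approach would need something like this to succeed.
try:
    sorted(values)
    orderable = True
except TypeError:
    orderable = False
print(orderable)   # False
```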
Second, by not testing for equality on objects, itertools.permutations avoids paying the cost of calling the __eq__ method in the usual case where it's not necessary.
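One way to see this is to permute objects whose __eq__ counts its own calls. The Token class below is a made-up helper for this illustration, and the sketch assumes CPython's itertools:

```python
import itertools


class Token:
    """Hypothetical element type that records how often __eq__ is called."""
    eq_calls = 0

    def __init__(self, label):
        self.label = label

    def __eq__(self, other):
        Token.eq_calls += 1
        return isinstance(other, Token) and self.label == other.label


tokens = [Token("a"), Token("a"), Token("b")]
perms = list(itertools.permutations(tokens))

print(len(perms))      # 6
print(Token.eq_calls)  # 0, no element was compared with another
```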
Basically, itertools.permutations solves the common case reliably and cheaply. There's certainly an argument to be made that itertools ought to provide a function that avoids duplicate permutations, but such a function should be in addition to itertools.permutations, not instead of it. Why not write such a function and submit a patch?
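For what it's worth, here is one possible shape for such a function. This is only a sketch under my own assumptions, not a proposed patch: the name unique_permutations is hypothetical, it assumes Python 3.3+ (for yield from), and it requires hashable elements rather than orderable ones, so it does pay for hashing and equality.

```python
from collections import Counter


def unique_permutations(iterable):
    """Yield each distinct permutation of the elements exactly once.

    Hypothetical helper: elements must be hashable, but need not be orderable.
    """
    counts = Counter(iterable)   # how many copies of each value remain
    n = sum(counts.values())
    prefix = []

    def extend():
        if len(prefix) == n:
            yield tuple(prefix)
            return
        # Try each distinct value once per position, so equal elements
        # never produce the same prefix twice.
        for value in counts:
            if counts[value] > 0:
                counts[value] -= 1
                prefix.append(value)
                yield from extend()
                prefix.pop()
                counts[value] += 1

    return extend()


print(list(unique_permutations([1, 1, 2])))
# [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
print(len(list(unique_permutations([1j, 1j, 2]))))
# 3
```

Unlike a next_permutation-style approach, this one does not promise lexicographic order by value; results follow the order in which values first appear in the input.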