sort() in Python using cmp

yoppy picture yoppy · Dec 8, 2015 · Viewed 20k times · Source

I am trying to sort a list, move all 0 to the end of list. example: [0,1,0,2,3,0,4]->[1,2,3,4,0,0,0]

and I see someone code it in 1 line

list.sort(cmp=lambda a,b:-1 if b==0 else 0)

But I don't understand what inside the parentheses mean.

Could anyone tell me? Thank you.

Answer

kay picture kay · Dec 8, 2015

Preface:

Sort a list according to the normal comparison:

some_list.sort()   

Supply a custom comparator:

some_list.sort(cmp=my_comparator)

A lambda function:

x = lambda a, b: a - b
# is roughly the same as
def x(a, b):
    return a - b

An if-else-expression:

value = truthy_case if condition else otherwise
# is roughly the same as
if condition:
    value = truthy_case
else:
    value = otherwise

The line list.sort(cmp=lambda a,b:-1 if b==0 else 0) itself:

Now, the condition in the comparator is whether b==0, if so indicate that b has a bigger value than a (the sign of the result is negative), otherwise indicate that the values compare the same (the sign is zero).

Whilst Python's list.sort() is stable, this code is not sane, because the comparator needs to test a, too, not only b. A proper implementation would use the key argument:

some_list.sort(key=lambda a: 0 if a == 0 else -1)

Fixed list.sort(cmp=...) implementation:

If you want to use list.sort(cmp=...) (you don't) or if you are just curious, this is a sane implementation:

some_list.sort(cmp=lambda a, b: 0 if a == b else
                               +1 if a == 0 else
                               -1 if b == 0 else 0)

But notice:

In Py3.0, the cmp parameter was removed entirely (as part of a larger effort to simplify and unify the language, eliminating the conflict between rich comparisons and the __cmp__ methods).

An alternative:

Sorting a list is in O(𝘯 log 𝘯). I do not know if for this simple problem the code runs faster, but I wouldn't think so. An O(𝘯) solution is filtering:

new_list = [x for x in some_list if x != 0]
new_list.extend([0] * (len(some_list) - len(new_list)))

The difference will probably only matter for quite long lists, though.