Stefan Falk
Stefan Falk

Reputation: 25387

Why is my list not sorted as expected?

I have a dict() called twitter_users which holds TwitterUser objects as values. I want those objects to be sorted by the field mentioned. However, using sorted() does not work as I expect. I provide a lambda function that is supposed to determine if user a or user b was mentioned more often.

srt = sorted(twitter_users.values(), 
         cmp=(lambda a,b: 
              True if a.mentioned > b.mentioned else False))

for s in srt:
    print s.mentioned

Unfortunately this is not working and the list srt isn't sorted in any way.

How can I make this work?

Upvotes: 3

Views: 367

Answers (1)

Martijn Pieters
Martijn Pieters

Reputation: 1121584

A cmp function should return an integer, 0 when equal, 1 or higher when a should come after b and -1 or lower if they should come in the opposite order.

You instead return False and True. Because the Python boolean type is a subclass of int, these objects have the values 0 and 1 when interpreted as integers. You never return -1, so you are confusing the sorting algorithm; you tell it the order of a and b is either always 'equal' or a should come before b, always. But the sorting algorithm sometimes asks for a and b swapped, in which case you gave it conflicting information!

Note that your expression is rather verbose; True if a.mentioned > b.mentioned else False could just be simplified to a.mentioned > b.mentioned; the > operator already produces either True or False. Using simple integers you can see that that is not going to produce expected results:

>>> sorted([4, 2, 5, 3, 8], cmp=lambda a, b: a > b)
[4, 2, 5, 3, 8]

while actually returning -1, 0, or 1 does work:

>>> sorted([4, 2, 5, 3, 8], cmp=lambda a, b: 1 if a > b else 0 if a == b else -1)
[2, 3, 4, 5, 8]

or instead of such a verbose expression, just use the built-in cmp() function; for your case you'd use that like this:

srt = sorted(twitter_users.values(), cmp=lambda a, b: cmp(a.mentioned, b.mentioned)) 

But you shouldn't really use cmp at all; there is a far simpler (and more efficient) option. Just use the key function instead, which simply returns the mentioned attribute:

srt = sorted(twitter_users.values(), key=lambda v: v.mentioned) 

The key function produces values by which the actual sort takes place; this function is used to produce a Schwartzian transform. Such a transform is more efficient because it is only called O(n) times, while the cmp function is called O(n log n) times.

Because you are only accessing an attribute, instead of a lambda you can use a operator.attrgetter() object to do the attribute fetching for you:

from operator import attrgetter

srt = sorted(twitter_users.values(), key=attrgetter('mentioned')) 

Upvotes: 10

Related Questions