Askhole

Reputation: 23

Sorting by any number of criteria in Python

Say I have a table of data, and I want to be able to return data from the table sorted by some criteria (like SQL). The problem is, I don't know how many things I need to order by, and the ORDER BY command could be followed by just one column name, or two, or 100.

I've seen other answers that do this:

s = sorted(s, key = lambda x: (x[1], x[2]))

...but the tuple argument is hard-coded, not created at runtime. I want to be able to do something like this:

# Build list of columns to sort by, in ascending order of priority
orderings = [0, 2, ...]
s = sorted(s, key = lambda x: orderings)

Is that possible? What other options do I have?

Upvotes: 2

Views: 448

Answers (4)

rileymcdowell

Reputation: 599

I'll answer your question with pure Python first, then show how to solve the problem with a library. Pick whichever better suits what you're trying to do.


Pure Python

The problem here is that you aren't sure which columns you want to sort by when you're writing the code, but you still need to create a tuple to sort by. That's what the (x[1], x[2]) above is doing. It's selecting the second and third columns (index 1 and 2) as the columns to sort on. You need a way to do that without hard-coding the integers 1 and 2 into the code.

Let's say you have a list of lists called s and you want to sort by some subset of the columns in those lists.

s = [[3, 'b', 1.0], [1, 'a', 2.0], [2, 'a', 1.0]]  # Example data; any list of lists works.
orderings = [1, 2]  # Could come from user input, for example.
s = sorted(s, key=lambda elem: tuple(map(elem.__getitem__, orderings)))

Indexing in Python is syntactic sugar for calling the __getitem__ magic method, so elem[i] is equivalent to elem.__getitem__(i). Mapping elem.__getitem__ over the indices in orderings yields the values to sort on, and the tuple constructor collects them into a tuple. This happens once per row of s and produces that row's sort key, which is exactly what sorted expects from its key function.
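
To make the per-row key concrete, here is a minimal sketch that wraps the same idea in a helper. The name sort_by_columns and the sample rows are made up for illustration, and a generator expression stands in for map(elem.__getitem__, ...), which builds the same tuple.

```python
def sort_by_columns(rows, orderings):
    """Return rows sorted by the column indices in orderings, highest priority first."""
    # For each row, build a tuple such as (row[1], row[2]) from the runtime list.
    return sorted(rows, key=lambda row: tuple(row[i] for i in orderings))


# Hypothetical sample table: each inner list is one row.
rows = [
    [3, "b", 1.0],
    [1, "a", 2.0],
    [2, "a", 1.0],
]

by_second_then_third = sort_by_columns(rows, [1, 2])
by_third_then_first = sort_by_columns(rows, [2, 0])
```

Changing the contents of the orderings list is all it takes to change the sort; the key function itself never needs to know how many columns are involved.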


Library Solution

In my opinion, sorting data this way is great for one-off work, but it's difficult to read. In your question, you're supposing that you have a table of data in Python and you want to do some sorting on it. The best way to handle that is to use a library designed for tabular data. I suggest pandas and its DataFrame type. Let's suppose your data is already in a pandas DataFrame called df with columns called first, second, and third. Let's also suppose that you want to sort by first ascending, then by third descending.

df.sort_values(by=['first', 'third'], ascending=[True, False])

That's it. This function returns a new dataframe sorted by first, then third, in ascending and descending order, respectively. All you need to know to do this is the names of your columns and their sort directions. It's significantly cleaner than dealing with tuples and indices. The downside is that the pandas library has a lot of dependencies that can be difficult to install.

Upvotes: 0

MSeifert

Reputation: 152657

This mostly makes sense with dictionaries, but the approach is similar to @wwii's answer (I'm using keys instead of column indices):

results = [{'name': 'Peter', 'score': 10, 'match': 0},
           {'name': 'Wendy', 'score': 2, 'match': 1},
           {'name': 'Hook', 'score': 1000, 'match': 0}]

from operator import itemgetter

orderby = ['match']  # define the keys by which to sort

sorted(results, key=itemgetter(*orderby))

gives:

[{'match': 0, 'name': 'Peter', 'score': 10},
 {'match': 0, 'name': 'Hook', 'score': 1000},
 {'match': 1, 'name': 'Wendy', 'score': 2}]

or:

orderby = ['match', 'name']

sorted(results, key=itemgetter(*orderby))

which gives:

[{'match': 0, 'name': 'Hook', 'score': 1000},
 {'match': 0, 'name': 'Peter', 'score': 10},
 {'match': 1, 'name': 'Wendy', 'score': 2}]
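
When orderby really is built at runtime, two details of itemgetter may matter: with a single name it returns the bare value rather than a one-element tuple (which sorts the same way), and with no names at all it raises TypeError. A minimal sketch that guards against the empty case, assuming a hypothetical helper called make_key:

```python
from operator import itemgetter


def make_key(orderby):
    """Build a key function from a list of dictionary keys chosen at runtime."""
    if not orderby:
        # itemgetter() with no arguments raises TypeError, so fall back to a
        # constant key; sorted() then returns the rows in their original order.
        return lambda row: ()
    # One name yields the bare value, several names yield a tuple of values.
    return itemgetter(*orderby)


results = [
    {"name": "Peter", "score": 10, "match": 0},
    {"name": "Wendy", "score": 2, "match": 1},
    {"name": "Hook", "score": 1000, "match": 0},
]

unsorted_names = [r["name"] for r in sorted(results, key=make_key([]))]
by_match_name = [r["name"] for r in sorted(results, key=make_key(["match", "name"]))]
```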

Upvotes: 1

Mark Ransom

Reputation: 308206

A simple way would be similar to what you already have:

s = sorted(s, key = lambda x: [x[i] for i in orderings])

Otherwise you can simply sort multiple times. Python sorts are stable, which means any elements that compare equal will keep their original order. By sorting multiple times from the least significant to the most significant key, you'll find the end result to be exactly what you need.
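
The multi-pass approach could be sketched like this. The helper name sort_multipass and the (column, descending) pair format are assumptions made for the example; one side benefit is that each pass can have its own direction, which a single tuple key cannot express for non-numeric columns.

```python
def sort_multipass(rows, orderings):
    """Sort rows by several columns using one stable sort per column.

    orderings is a list of (column_index, descending) pairs, most
    significant first. This pair format is an assumption for the example.
    """
    result = list(rows)  # copy so the caller's list is left untouched
    # Apply the least significant key first; each later (more significant)
    # pass keeps ties in the order the earlier passes produced.
    for column, descending in reversed(orderings):
        result.sort(key=lambda row: row[column], reverse=descending)
    return result


rows = [
    ["Peter", 10, 0],
    ["Wendy", 2, 1],
    ["Hook", 1000, 0],
]

# Column 2 ascending, then column 1 descending.
ranked = sort_multipass(rows, [(2, False), (1, True)])
```

Here the rows tied on column 2 (Hook and Peter) end up ordered by column 1 descending, because that ordering was established by the earlier pass and preserved by the later one.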

Upvotes: 4

wwii

Reputation: 23753

Use operator.itemgetter for the key function.

>>> import operator
>>> items = [1, 2, 4]
>>> key = operator.itemgetter(*items)
>>> key
operator.itemgetter(1, 2, 4)
>>> a = ['kljdfii', 'lkjfo', 'lklvjo']
>>> sorted(a, key = key)
['lkjfo', 'lklvjo', 'kljdfii']

Upvotes: 3
