Reputation: 23
Say I have a table of data, and I want to be able to return data from the table sorted by some criteria (like SQL). The problem is, I don't know how many things I need to order by, and the ORDER BY command could be followed by just one column name, or two, or 100.
I've seen other answers that do this:
s = sorted(s, key = lambda x: (x[1], x[2]))
...but the tuple argument is hard-coded, not created at runtime. I want to be able to do something like this:
# Build list of columns to sort by, in ascending order of priority
orderings = [0, 2, ...]
s = sorted(s, key = lambda x: orderings)
Is that possible? What other options do I have?
Upvotes: 2
Views: 448
Reputation: 599
I'll answer your question with pure Python, then tell you how to solve the problem with a library. You can proceed with whichever better suits what you're trying to do.
The problem here is that you aren't sure which columns you want to sort by when you're writing the code, but you still need to create a tuple to sort by. That's what the (x[1], x[2]) above is doing. It's selecting the second and third columns (indices 1 and 2) as the columns to sort on. You need a way to do that without hard-coding the integers 1 and 2 into the code.
Let's say you have a list of lists called s and you want to sort by some subset of the columns in those lists.
s = [[3, 'b', 2.0], [1, 'a', 5.0], [2, 'a', 1.0]]  # example rows; any list of lists works
orderings = [1, 2]  # Could come from user input, for example.
s = sorted(s, key=lambda elem: tuple(map(elem.__getitem__, orderings)))
It turns out that indexing in Python is syntactic sugar for calling the __getitem__ magic method. Mapping __getitem__ over every index in orderings yields the values to sort on, and the tuple constructor collects them into a tuple on the fly. This happens once per row of s, essentially selecting out the sort keys. That's exactly what the sorted function is looking for.
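To make the key function easier to inspect, here is a minimal sketch that builds it from a list of column indices and shows the key it produces for one row. The helper name make_key and the rows in table are made up for illustration; they are not part of the question.

```python
def make_key(orderings):
    """Return a key function that picks the given column indices from a row."""
    def key(row):
        # row[i] is what row.__getitem__(i) does under the hood
        return tuple(row[i] for i in orderings)
    return key

# Hypothetical sample data: each inner list is one row of the table.
table = [
    ["pear", 3, 2.5],
    ["apple", 3, 1.0],
    ["fig", 1, 9.0],
]

orderings = [1, 2]  # sort by column 1, then by column 2
key = make_key(orderings)

print(key(table[0]))           # the sort key computed for the first row
print(sorted(table, key=key))  # rows ordered by those keys
```

Because the indices are only read when the key function runs, orderings can be built at runtime from user input or a parsed ORDER BY clause.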
In my opinion, sorting data this way is great for one-off work, but it's difficult to read. In your question, you're supposing that you have a table of data in Python and you want to do some sorting on it. The best way to handle that is to use an appropriate library for dealing with tabular data. I suggest the pandas library. Let's suppose your data is already in a pandas DataFrame called df with columns called first, second, and third. Let's also suppose that you want to sort by first ascending, then by third descending.
df.sort_values(by=['first', 'third'], ascending=[True, False])
That's it. This call returns a new DataFrame sorted by first, then third, in ascending and descending order, respectively. All you need to know to do this is the names of your columns and their sort directions. It's significantly cleaner than dealing with tuples and indices. The downside is that the pandas library has a lot of dependencies that can be difficult to install.
Upvotes: 0
Reputation: 152657
This mostly makes sense with dictionaries, but the approach is similar to @wwii's answer (I'm using keys instead of columns):
results = [{'name': 'Peter', 'score': 10, 'match': 0},
{'name': 'Wendy', 'score': 2, 'match': 1},
{'name': 'Hook', 'score': 1000, 'match': 0}]
from operator import itemgetter
orderby = ['match'] # define the keys by which to sort
sorted(results, key=itemgetter(*orderby))
gives:
[{'match': 0, 'name': 'Peter', 'score': 10},
{'match': 0, 'name': 'Hook', 'score': 1000},
{'match': 1, 'name': 'Wendy', 'score': 2}]
or:
orderby = ['match', 'name']
sorted(results, key=itemgetter(*orderby))
which gives:
[{'match': 0, 'name': 'Hook', 'score': 1000},
{'match': 0, 'name': 'Peter', 'score': 10},
{'match': 1, 'name': 'Wendy', 'score': 2}]
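One caveat worth sketching: itemgetter returns a bare value when given a single key and a tuple when given several, and calling it with no arguments raises TypeError. Both return shapes work as sort keys, but if orderby might be empty at runtime a small wrapper avoids the error. The helper name sort_by is hypothetical, and returning the records in their original order for an empty orderby is an assumption about the desired behaviour.

```python
from operator import itemgetter

def sort_by(records, orderby):
    """Sort a list of dicts by the keys named in orderby (hypothetical helper)."""
    if not orderby:
        # itemgetter() with no arguments raises TypeError,
        # so return an unsorted copy instead.
        return list(records)
    return sorted(records, key=itemgetter(*orderby))

results = [
    {'name': 'Peter', 'score': 10, 'match': 0},
    {'name': 'Wendy', 'score': 2, 'match': 1},
    {'name': 'Hook', 'score': 1000, 'match': 0},
]

print([r['name'] for r in sort_by(results, ['match', 'name'])])
print([r['name'] for r in sort_by(results, [])])
```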
Upvotes: 1
Reputation: 308206
A simple way would be similar to what you already have:
s = sorted(s, key = lambda x: [x[i] for i in orderings])
Otherwise you can simply sort multiple times. Python sorts are stable, which means any elements that compare equal will keep their original order. By sorting multiple times from the least significant to the most significant key, you'll find the end result to be exactly what you need.
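The multiple-pass idea can be sketched as follows. It also allows a different direction per column, which a single tuple key cannot express directly. The function name multisort, the (column_index, descending) pair format, and the sample rows are assumptions made for illustration.

```python
def multisort(rows, specs):
    """Sort rows by (column_index, descending) pairs, most significant first."""
    rows = list(rows)  # work on a copy so the caller's list is untouched
    # Apply the least significant key first; because the sort is stable,
    # each later pass keeps the earlier ordering among rows that tie.
    for index, descending in reversed(specs):
        rows.sort(key=lambda row: row[index], reverse=descending)
    return rows

# Hypothetical sample rows.
s = [
    ["b", 2, 10],
    ["a", 2, 30],
    ["c", 1, 20],
    ["d", 2, 20],
]

# Column 1 ascending, then column 2 descending.
print(multisort(s, [(1, False), (2, True)]))
```

This costs one sort per column rather than a single sort with a compound key, which is usually acceptable for a handful of columns.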
Upvotes: 4
Reputation: 23753
Use operator.itemgetter for the key function.
>>> import operator
>>> items = [1, 2, 4]
>>> key = operator.itemgetter(*items)
>>> key
operator.itemgetter(1, 2, 4)
>>> a = ['kljdfii', 'lkjfo', 'lklvjo']
>>> sorted(a, key = key)
['lkjfo', 'lklvjo', 'kljdfii']
>>>
Upvotes: 3