Reputation: 1614
I've seen solutions for sorting lists by a fixed number of attributes: Sort a list by multiple attributes?
It has a nice sorted solution: s = sorted(s, key = lambda x: (x[1], x[2]))
and also the itemgetter example.
However, I have a varying number of attributes. Here is an example with 2 attributes:
example_list = [{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38},
# etc.
]
But there can also be 1, 3, or more. I cannot hard-code the lambda function or the itemgetter like that. However, I do know the number of dimensions at execution time (even though it varies from case to case). So I've made this (with the parameter set for the 2-dimension example):
example_list = [{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38},
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'z', 'd2_sort': 38},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'z', 'd2_sort': 38}
]
def order_get(a, nr):
    result = []
    for i in range(1, nr + 1):
        result.append(a.get('d' + str(i) + '_sort'))
    return result
example_list.sort(key = lambda x: order_get(x, 2)) # for this example hard set to 2
In [82]: example_list
Out[82]:
[{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'z', 'd2_sort': 38},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'z', 'd2_sort': 38},
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38}]
But is this really the best way to do this? By that I mean: 1) is it Pythonic, and 2) how does it perform? Is this a common issue or not?
Upvotes: 1
Views: 2430
Reputation: 104102
You can support an arbitrary number of sort keys so long as the keys follow a predictable pattern.
Suppose you have d[X]_sort to d[Y]_sort, where X and Y are integers and the sort keys all end in _sort. You can then use a key function like so:
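import re
def arb_kf(d):
    # keep only the keys that end in '_sort'
    li = filter(lambda s: s.endswith('_sort'), d)
    # pair the integer in each key name with that key's value, e.g. 'd2_sort': 30 -> (2, 30)
    rtr = [tuple(map(int, re.findall(r'([0-9]+)', k) + [d[k]])) for k in li]
    rtr.sort()
    return rtr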
With your example list of dicts:
example_list = [{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38},
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'z', 'd2_sort': 38},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'z', 'd2_sort': 38}
]
>>> for d in sorted(example_list, key=arb_kf):
...     print(d)
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 1, 'd1_desc': 'a'}
{'d2_desc': 'y', 'd2_sort': 35, 'd1_sort': 1, 'd1_desc': 'a'}
{'d2_desc': 'z', 'd2_sort': 38, 'd1_sort': 1, 'd1_desc': 'a'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 2, 'd1_desc': 'b'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 2, 'd1_desc': 'b'}
{'d2_desc': 'z', 'd2_sort': 38, 'd1_sort': 2, 'd1_desc': 'b'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 3, 'd1_desc': 'c'}
{'d2_desc': 'z', 'd2_sort': 38, 'd1_sort': 3, 'd1_desc': 'c'}
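To make the ordering above easier to follow, it may help to look at the key that arb_kf builds for a single dict. This is only a sketch that repeats the function from this answer; the names row and key are made up for illustration.

```python
import re

def arb_kf(d):
    # keep only the keys that end in '_sort'
    li = filter(lambda s: s.endswith('_sort'), d)
    # pair the integer in each key name with that key's value
    rtr = [tuple(map(int, re.findall(r'([0-9]+)', k) + [d[k]])) for k in li]
    rtr.sort()
    return rtr

# 'row' is a hypothetical single entry taken from the example list
row = {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35}
key = arb_kf(row)
print(key)  # [(1, 1), (2, 35)]
```

Each tuple is (number in the key name, value), so sorted() compares d1_sort first, then d2_sort, and so on for however many _sort keys the dict has.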
Suppose that the integer in d[X]_sort differs between some of the dicts and you want to give more weight to lower numbers; i.e., a dict that has d0_sort should sort ahead of a dict whose lowest key is d1_sort.
Since Python compares tuples element by element, this holds true:
>>> sorted([(1,99), (1,1,1), (0,50), (1,0,99)])
[(0, 50), (1, 0, 99), (1, 1, 1), (1, 99)]
Since the key function returns a list of tuples, and lists are compared element by element just like tuples, the same ordering applies in this case.
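As a rough illustration of that point, here is the same comparison applied to lists of (index, value) tuples shaped like the ones the key function returns. The variable names are hypothetical and the values are written out by hand rather than produced by the key function.

```python
# Keys shaped like the key function's return value: lists of (index, value) tuples.
key_with_d0 = [(0, 3), (2, 38)]   # a dict that has d0_sort and d2_sort
key_a = [(1, 1), (2, 30)]         # a dict that has d1_sort and d2_sort
key_b = [(1, 1), (2, 35)]

# Lists compare element by element, and each element (a tuple) does the same.
ordered = sorted([key_b, key_with_d0, key_a])
print(ordered)
# [[(0, 3), (2, 38)], [(1, 1), (2, 30)], [(1, 1), (2, 35)]]
```

The key that starts with index 0 comes first regardless of its value, because (0, 3) is less than (1, 1) at the first element.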
So if your example list has a dict with 'd0_sort': 3 in it, that dict will sort ahead of anything whose lowest key is 'd1_sort':
example_list = [{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38},
{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'z', 'd2_sort': 38},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'x', 'd2_sort': 30},
{'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'z', 'd2_sort': 38},
{'d1_desc': 'b', 'd0_sort': 3, 'd2_desc': 'z', 'd2_sort': 38}
]
>>> for d in sorted(example_list, key=arb_kf):
...     print(d)
{'d0_sort': 3, 'd2_desc': 'z', 'd2_sort': 38, 'd1_desc': 'b'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 1, 'd1_desc': 'a'}
{'d2_desc': 'y', 'd2_sort': 35, 'd1_sort': 1, 'd1_desc': 'a'}
{'d2_desc': 'z', 'd2_sort': 38, 'd1_sort': 1, 'd1_desc': 'a'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 2, 'd1_desc': 'b'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 2, 'd1_desc': 'b'}
{'d2_desc': 'z', 'd2_sort': 38, 'd1_sort': 2, 'd1_desc': 'b'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 3, 'd1_desc': 'c'}
{'d2_desc': 'z', 'd2_sort': 38, 'd1_sort': 3, 'd1_desc': 'c'}
Upvotes: 1
Reputation: 31534
I would still use an itemgetter, since it's faster and you create it once and reuse it every time:
from operator import itemgetter
def make_getter(nr):
    # range works on both Python 2 and 3; the original used xrange
    keys = ('d%d_sort' % (n + 1) for n in range(nr))
    return itemgetter(*keys)
example_list.sort(key=make_getter(2))
Creating the itemgetter takes time. If you have to use it on more than one list, store it once, e.g. get_two = make_getter(2), and use get_two as the key function, since it is always the same.
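A minimal sketch of that reuse, assuming Python 3 and nr >= 1. The lists first and second are made-up data, not from the question.

```python
from operator import itemgetter

def make_getter(nr):
    # Build 'd1_sort', 'd2_sort', ... up to nr; assumes nr >= 1,
    # because itemgetter() with no arguments raises TypeError.
    keys = ['d%d_sort' % n for n in range(1, nr + 1)]
    return itemgetter(*keys)

# Create the getter once...
get_two = make_getter(2)

# ...and reuse it for every list that has the same two dimensions.
first = [{'d1_sort': 2, 'd2_sort': 30},
         {'d1_sort': 1, 'd2_sort': 35},
         {'d1_sort': 1, 'd2_sort': 30}]
second = [{'d1_sort': 3, 'd2_sort': 38},
          {'d1_sort': 3, 'd2_sort': 30}]

first.sort(key=get_two)
second.sort(key=get_two)

print(get_two(first[0]))  # (1, 30)
```

One detail to keep in mind: with a single key, itemgetter returns the bare value rather than a 1-tuple, so make_getter(1) yields keys like 1 instead of (1,). That still sorts correctly, since all keys in one sort have the same shape.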
Upvotes: 1