Reputation: 1614

Dynamic sorting of list on varying number of attributes

I've seen solutions for sorting lists on a fixed number of attributes: Sort a list by multiple attributes?

It has a nice sorted solution: s = sorted(s, key = lambda x: (x[1], x[2]))

and also the itemgetter example

However, I have a varying number of attributes. Here is an example with 2 attributes:

example_list = [{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
    {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38},
    # etc.
]

But it can also be 1, or 3, or more, so I cannot hard-code the lambda or the itemgetter like this. However, I do know the number of dimensions at execution time (even though it varies from case to case). So I've made this (with the parameter set for the 2-dimension example):

example_list = [{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
    {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38},
    {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'z', 'd2_sort': 38},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'z', 'd2_sort': 38}
]

def order_get( a , nr):
    result = []
    for i in range(1, nr+1):
        result.append(a.get('d' + str(i) + '_sort'))
    return result

example_list.sort(key = lambda x: order_get(x, 2)) # for this example hard set to 2

In [82]: example_list
Out[82]: 
[{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
 {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
 {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'z', 'd2_sort': 38},
 {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
 {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
 {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'z', 'd2_sort': 38},
 {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'x', 'd2_sort': 30},
 {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38}]

But is this really the best way to do this? By that I mean 1) Pythonic and 2) performance-wise. Is this a common issue or not?

Upvotes: 1

Answers (2)

dawg

Reputation: 104102

You can support an arbitrary number of sort keys as long as the key names follow a predictable pattern.

Suppose you have d[X]_sort to d[Y]_sort, where X and Y are integers and the sort keys all end in _sort. Then you can use a key function like this:

import re

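def arb_kf(d):
    # Keys that end in '_sort', e.g. 'd1_sort'.
    li = [s for s in d if s.endswith('_sort')]
    # One tuple per sort key: (number from the key name, value as int).
    rtr = [tuple(map(int, re.findall(r'([0-9]+)', k) + [d[k]])) for k in li]
    # Order by the number in the key name, so d1 is compared before d2.
    rtr.sort()
    return rtr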
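
To make the shape of the key concrete, here is a small sketch of what this key function returns for a single dict. The function is repeated (with the filter written as a comprehension) only so the snippet runs on its own on Python 2 or 3. The sample dicts `row` and `mixed` are made up for illustration, and the sketch assumes every `_sort` key contains exactly one number.

```python
import re

def arb_kf(d):
    # Keys that end in '_sort', e.g. 'd1_sort'.
    li = [s for s in d if s.endswith('_sort')]
    # One tuple per sort key: (number from the key name, value as int).
    rtr = [tuple(map(int, re.findall(r'([0-9]+)', k) + [d[k]])) for k in li]
    # Order by the number in the key name, so d1 is compared before d2.
    rtr.sort()
    return rtr

# Hypothetical sample rows, not taken from the question's list.
row = {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30}
mixed = {'d10_sort': 1, 'd2_sort': 5}

print(arb_kf(row))    # [(1, 1), (2, 30)]
print(arb_kf(mixed))  # [(2, 5), (10, 1)]
```

Because the number is converted with `int`, `d10_sort` is ordered after `d2_sort` rather than before it, which is what a plain string sort of the key names would give.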

With your example list of dicts:

example_list = [{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
    {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38},
    {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'z', 'd2_sort': 38},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'z', 'd2_sort': 38}
]


>>> for d in sorted(example_list, key=arb_kf) :
...     print d  
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 1, 'd1_desc': 'a'}
{'d2_desc': 'y', 'd2_sort': 35, 'd1_sort': 1, 'd1_desc': 'a'}
{'d2_desc': 'z', 'd2_sort': 38, 'd1_sort': 1, 'd1_desc': 'a'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 2, 'd1_desc': 'b'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 2, 'd1_desc': 'b'}
{'d2_desc': 'z', 'd2_sort': 38, 'd1_sort': 2, 'd1_desc': 'b'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 3, 'd1_desc': 'c'}
{'d2_desc': 'z', 'd2_sort': 38, 'd1_sort': 3, 'd1_desc': 'c'}

Now suppose the integer X in d[X]_sort differs between dicts, and you want lower numbers to carry more weight; i.e., a dict that has d0_sort sorts ahead of any dict whose lowest sort key is d1_sort.

Since Python compares tuples element by element, this holds true:

>>> sorted([(1,99), (1,1,1), (0,50), (1,0,99)])
[(0, 50), (1, 0, 99), (1, 1, 1), (1, 99)]

The key function returns a list of tuples, and lists are compared the same way, so this works here too.

Then if your example list has a dict with 'd0_sort': 3 in it, that dict will sort ahead of anything with 'd1_sort' in it:
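
The same comparison can be checked on keys directly. The lists below are hand-written stand-ins for what the key function would return (pairs of dimension number and sort value); the names `key_d0`, `key_d1_a` and `key_d1_b` are just for illustration.

```python
# Hypothetical keys: each is a list of (dimension number, sort value) pairs.
key_d0 = [(0, 3), (2, 38)]     # a dict that has d0_sort and d2_sort
key_d1_a = [(1, 1), (2, 30)]   # a dict that has d1_sort and d2_sort
key_d1_b = [(1, 1), (2, 35)]

# Lists compare element by element, and each element here is a tuple that
# also compares element by element, so the first differing pair decides.
assert key_d0 < key_d1_a       # (0, 3) < (1, 1) because 0 < 1
assert key_d1_a < key_d1_b     # first pairs tie, then (2, 30) < (2, 35)

ordered = sorted([key_d1_b, key_d0, key_d1_a])
print(ordered)
```

Note that the d0 key sorts first even though its value 3 is larger than the d1 values, because the dimension number is compared before the value.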

example_list = [{'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'y', 'd2_sort': 35},
    {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'z', 'd2_sort': 38},
    {'d1_desc': 'a', 'd1_sort': 1, 'd2_desc': 'z', 'd2_sort': 38},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'c', 'd1_sort': 3, 'd2_desc': 'x', 'd2_sort': 30},
    {'d1_desc': 'b', 'd1_sort': 2, 'd2_desc': 'z', 'd2_sort': 38},
    {'d1_desc': 'b', 'd0_sort': 3, 'd2_desc': 'z', 'd2_sort': 38}
]
>>> for d in sorted(example_list, key=arb_kf) :
...     print d  
{'d0_sort': 3, 'd2_desc': 'z', 'd2_sort': 38, 'd1_desc': 'b'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 1, 'd1_desc': 'a'}
{'d2_desc': 'y', 'd2_sort': 35, 'd1_sort': 1, 'd1_desc': 'a'}
{'d2_desc': 'z', 'd2_sort': 38, 'd1_sort': 1, 'd1_desc': 'a'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 2, 'd1_desc': 'b'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 2, 'd1_desc': 'b'}
{'d2_desc': 'z', 'd2_sort': 38, 'd1_sort': 2, 'd1_desc': 'b'}
{'d2_desc': 'x', 'd2_sort': 30, 'd1_sort': 3, 'd1_desc': 'c'}
{'d2_desc': 'z', 'd2_sort': 38, 'd1_sort': 3, 'd1_desc': 'c'}

Upvotes: 1

enrico.bacis

Reputation: 31534

I would still use an itemgetter, since it's faster and you can create it once and reuse it:

from operator import itemgetter

def make_getter(nr):
    # range (rather than xrange) works on both Python 2 and 3
    keys = ('d%d_sort' % (n + 1) for n in range(nr))
    return itemgetter(*keys)

example_list.sort(key=make_getter(2))

Creating the itemgetter takes some time. If you need to sort more than one list with the same keys, create the getter once, e.g. get_two = make_getter(2), and pass get_two as the key function each time.
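
As a rough illustration of reusing one getter across several lists (written with `range` so it runs on Python 2 and 3; `list_a` and `list_b` are made-up data, not the list from the question):

```python
from operator import itemgetter

def make_getter(nr):
    # Build 'd1_sort', 'd2_sort', ... up to nr and wrap them in one itemgetter.
    keys = ['d%d_sort' % n for n in range(1, nr + 1)]
    return itemgetter(*keys)

get_two = make_getter(2)  # built once, reused below

# Hypothetical data for the sketch.
list_a = [{'d1_sort': 2, 'd2_sort': 30}, {'d1_sort': 1, 'd2_sort': 35}]
list_b = [{'d1_sort': 1, 'd2_sort': 38}, {'d1_sort': 1, 'd2_sort': 30}]

list_a.sort(key=get_two)
list_b.sort(key=get_two)

print(get_two(list_a[0]))  # (1, 35)
```

Two behaviours are worth knowing. With a single key, `itemgetter` returns the bare value rather than a 1-tuple, which still sorts fine. And unlike the `.get` call in the question, `itemgetter` raises `KeyError` if a dict lacks one of the keys.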

Upvotes: 1
