headache

Reputation: 8977

Sort a list by multiple attributes?

I have a list of lists:

[[12, 'tall', 'blue', 1],
[2, 'short', 'red', 9],
[4, 'tall', 'blue', 13]]

If I wanted to sort by one element, say the tall/short element, I could do it via s = sorted(s, key = itemgetter(1)).

If I wanted to sort by both tall/short and colour, I could do the sort twice, once for each element, but is there a quicker way?

Upvotes: 740

Views: 627824

Answers (8)

ron_g

Reputation: 1663

Several years late to the party, but I wanted to sort on two criteria and also use reverse=True. In case someone else wants to know how, you can wrap your criteria (functions) in parentheses:

s = sorted(my_list, key=lambda x: ( criteria_1(x), criteria_2(x) ), reverse=True)

Example:

# Let's say we have a list of students with (name, grade, age)
students = [
    ("Alice", 95, 21),
    ("Bob", 95, 19),
    ("Charlie", 88, 22),
    ("David", 88, 20),
    ("Eve", 92, 21)
]

# Simple functions that return grade / age
def grade_criteria(student): return student[1]
def age_criteria(student): return student[2]

# Sort both grade and age, descending (reverse=True)
sorted_students = sorted(students, 
                           key=lambda x: (grade_criteria(x), age_criteria(x)), 
                           reverse=True)

# Students sorted by grade and age (both descending)
for student in sorted_students:
    print(f"Name: {student[0]}, Grade: {student[1]}, Age: {student[2]}")

# Output:
# Name: Alice, Grade: 95, Age: 21
# Name: Bob, Grade: 95, Age: 19
# Name: Eve, Grade: 92, Age: 21
# Name: Charlie, Grade: 88, Age: 22
# Name: David, Grade: 88, Age: 20

You can have both ascending by leaving out reverse=True, or use negation to reverse just one of the criteria (negation only works for numeric values):

# Grade descending but age ascending
sorted_2 = sorted(students,
                 key=lambda x: (-grade_criteria(x), age_criteria(x)),
                 reverse=False)
# Result: [("Bob", 95, 19), ("Alice", 95, 21), ("Eve", 92, 21), ("David", 88, 20), ("Charlie", 88, 22)]

Upvotes: 27

pymen

Reputation: 6549

Multisort with the ability to specify ascending/descending order for each attribute

from operator import itemgetter, attrgetter
from functools import cmp_to_key


def multikeysort(items, *columns, attrs=True) -> list:
    """
    Perform a multiple column sort on a list of dictionaries or objects.
    Args:
        items (list): List of dictionaries or objects to be sorted.
        *columns: Columns to sort by, optionally preceded by a '-' for descending order.
        attrs (bool): True if items are objects, False if items are dictionaries.

    Returns:
        list: Sorted list of items.
    """
    getter = attrgetter if attrs else itemgetter

    def get_comparers():
        comparers = []

        for col in columns:
            col = col.strip()
            if col.startswith('-'):  # If descending, strip '-' and create a comparer with reverse order
                key = getter(col[1:])
                order = -1
            else:  # If ascending, use the column directly
                key = getter(col)
                order = 1

            comparers.append((key, order))
        return comparers

    comparers = get_comparers()  # build the (getter, direction) pairs once, not on every comparison

    def custom_compare(left, right):
        """Custom comparison function to handle multiple keys"""
        for fn, direction in comparers:
            result = (fn(left) > fn(right)) - (fn(left) < fn(right))
            if result != 0:
                return result * direction
        return 0

    return sorted(items, key=cmp_to_key(custom_compare))

Usage/test with SORT by DESC('opens'), ASC('clicks')

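from collections import namedtuple

# Intended as a method of a unittest.TestCase subclass (hence self)
def test_sort_objects(self):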
    Customer = namedtuple('Customer', ['id', 'opens', 'clicks'])

    customer1 = Customer(id=1, opens=4, clicks=8)
    customer2 = Customer(id=2, opens=4, clicks=7)
    customer3 = Customer(id=2, opens=5, clicks=1)
    customers = [customer1, customer2, customer3]

    sorted_customers = multikeysort(customers, '-opens', 'clicks')
    exp_sorted_customers = [customer3, customer2, customer1]
    self.assertEqual(exp_sorted_customers, sorted_customers)

Upvotes: 0

Convert the list of lists into a list of tuples, then sort the tuples by multiple fields.

 data=[[12, 'tall', 'blue', 1],[2, 'short', 'red', 9],[4, 'tall', 'blue', 13]]

 data=[tuple(x) for x in data]
 result = sorted(data, key = lambda x: (x[1], x[2]))
 print(result)

output:

 [(2, 'short', 'red', 9), (12, 'tall', 'blue', 1), (4, 'tall', 'blue', 13)]

Upvotes: 7

baz

Reputation: 1587

There is a < operator between lists, which compares them element by element, e.g.:

[12, 'tall', 'blue', 1] < [4, 'tall', 'blue', 13]

will give

False
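To connect this to sorting: lists and tuples compare lexicographically, so the first differing element decides the result. That is why a key function returning a tuple sorts by several fields at once. A minimal sketch using the rows from the question (the variable names are only illustrative):

```python
# Sequences compare element by element, left to right (lexicographic order).
a = [12, 'tall', 'blue', 1]
b = [4, 'tall', 'blue', 13]

# 12 < 4 is False, so the remaining elements are never examined.
first_comparison = a < b

# With a tuple key such as (height, colour), 'short' < 'tall' decides the order.
key_comparison = ('short', 'red') < ('tall', 'blue')

print(first_comparison)  # False
print(key_comparison)    # True
```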

Upvotes: 0

UpAndAdam

Reputation: 5467

It appears you could use a list instead of a tuple. This becomes more important I think when you are grabbing attributes instead of 'magic indexes' of a list/tuple.

In my case I wanted to sort by multiple attributes of a class, where the incoming keys were strings. I needed different sorting in different places, and I wanted a common default sort for the parent class that clients were interacting with, overriding the 'sorting keys' only when I really needed to, and storing them as lists that the class could share.

So first I defined a helper method

def attr_sort(self, attrs=['someAttributeString']):
  '''helper to sort by the attributes named by strings of attrs in order'''
  return lambda k: [ getattr(k, attr) for attr in attrs ]

then to use it

# would be defined elsewhere, but shown here for conciseness
self.SortListA = ['attrA', 'attrB']
self.SortListB = ['attrC', 'attrA']
records = .... #list of my objects to sort
records.sort(key=self.attr_sort(attrs=self.SortListA))
# perhaps later nearby or in another function
more_records = .... #another list
more_records.sort(key=self.attr_sort(attrs=self.SortListB))

This will use the generated lambda function to sort the list by object.attrA and then object.attrB, assuming object has an attribute corresponding to each string name provided. The second case would sort by object.attrC and then object.attrA.

This also lets you expose sorting choices so that they can be shared by a consumer, a unit test, or a caller of your API who wants to tell you how sorting should be done for some operation. They only have to give you a list, and they are not coupled to your back-end implementation.
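As a self-contained illustration of the same idea, here is a sketch that builds the key from a list of attribute names with operator.attrgetter. The Record class, its attribute values, and the attr_sort_key helper are hypothetical and exist only for this example; attrgetter needs at least one name.

```python
from operator import attrgetter


class Record:
    """Hypothetical record type, for illustration only."""

    def __init__(self, attrA, attrB, attrC):
        self.attrA = attrA
        self.attrB = attrB
        self.attrC = attrC


def attr_sort_key(attrs):
    """Return a key function that yields the named attributes in order.

    With several names, attrgetter returns a tuple of values; with one
    name, it returns the single value. Both work as sort keys.
    """
    return attrgetter(*attrs)


records = [Record(2, 'x', 1), Record(1, 'y', 3), Record(1, 'x', 2)]

# Sort by attrA, then attrB.
by_a_then_b = sorted(records, key=attr_sort_key(['attrA', 'attrB']))

# Sort by attrC, then attrA.
by_c_then_a = sorted(records, key=attr_sort_key(['attrC', 'attrA']))

print([(r.attrA, r.attrB) for r in by_a_then_b])  # [(1, 'x'), (1, 'y'), (2, 'x')]
print([r.attrC for r in by_c_then_a])             # [1, 2, 3]
```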

Upvotes: 7

Mark Byers

Reputation: 838216

A key can be a function that returns a tuple:

s = sorted(s, key = lambda x: (x[1], x[2]))

Or you can achieve the same using itemgetter (which is faster and avoids a Python function call):

import operator
s = sorted(s, key = operator.itemgetter(1, 2))

And notice that here you can use sort instead of using sorted and then reassigning:

s.sort(key = operator.itemgetter(1, 2))
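For reference, a short sketch applying this to the list from the question. It shows that itemgetter(1, 2) returns a tuple of the two fields, and that rows with equal keys keep their original relative order because Python's sort is stable. The variable names are illustrative.

```python
from operator import itemgetter

rows = [
    [12, 'tall', 'blue', 1],
    [2, 'short', 'red', 9],
    [4, 'tall', 'blue', 13],
]

# itemgetter with several indexes returns a tuple of the selected items.
key = itemgetter(1, 2)
print(key(rows[0]))  # ('tall', 'blue')

# sorted() returns a new list and leaves rows untouched.
by_height_then_colour = sorted(rows, key=key)
print(by_height_then_colour)
# [[2, 'short', 'red', 9], [12, 'tall', 'blue', 1], [4, 'tall', 'blue', 13]]

# list.sort() sorts in place and returns None.
rows.sort(key=key)
```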

Upvotes: 1259

Dominic Suciu

Reputation: 131

Here's one way: you rewrite your sort function to take a list of comparison functions, each of which compares one of the attributes you want to test. For each pair of items, you call the comparison functions in order and return the first non-zero result. You call it by passing a lambda that calls a function that loops over a list of lambdas.

Its advantage is that it makes a single pass through the data rather than sorting the result of a previous sort, as other methods do. It also sorts in place, whereas sorted returns a new list.

I used it to write a rank function that ranks a list of objects, where each object is in a group and has a score, but you can add any list of attributes. Note the un-lambda-like, hackish use of a lambda to call a setter. The rank part won't work for a list of lists, but the sort will.

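#First, here's a pure list version
from functools import cmp_to_key

def cmp(a, b):
    # Python 3 removed the built-in cmp(); this is the usual replacement
    return (a > b) - (a < b)

my_sortLambdaLst = [lambda x,y:cmp(x[0], y[0]), lambda x,y:cmp(x[1], y[1])]
def multi_attribute_sort(x,y):
    r = 0
    for l in my_sortLambdaLst:
        r = l(x,y)
        if r!=0: return r #keep looping till you see a difference
    return r

Lst = [(4, 2.0), (4, 0.01), (4, 0.9), (4, 0.999),(4, 0.2), (1, 2.0), (1, 0.01), (1, 0.9), (1, 0.999), (1, 0.2) ]
# In Python 3, list.sort() only accepts a key, so wrap the comparison with cmp_to_key
Lst.sort(key=cmp_to_key(lambda x,y:multi_attribute_sort(x,y))) #The Lambda of the Lambda
for rec in Lst: print(str(rec))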

Here's a way to rank a list of objects

from functools import cmp_to_key

def cmp(a, b):
    # Python 3 removed the built-in cmp(); this is the usual replacement
    return (a > b) - (a < b)

class probe:
    def __init__(self, group, score):
        self.group = group
        self.score = score
        self.rank = -1
    def set_rank(self, r):
        self.rank = r
    def __str__(self):
        return '\t'.join([str(self.group), str(self.score), str(self.rank)])


def RankLst(inLst, group_lambda= lambda x:x.group, sortLambdaLst = [lambda x,y:cmp(x.group, y.group), lambda x,y:cmp(x.score, y.score)], SetRank_Lambda = lambda x, rank:x.set_rank(rank)):
    #Inner function is the only way (I could think of) to pass the sortLambdaLst into a sort function
    def multi_attribute_sort(x,y):
        r = 0
        for l in sortLambdaLst:
            r = l(x,y)
            if r!=0: return r #keep looping till you see a difference
        return r

    if not inLst: return #nothing to sort or rank
    #In Python 3, list.sort() only accepts a key, so wrap the comparison with cmp_to_key
    inLst.sort(key=cmp_to_key(lambda x,y:multi_attribute_sort(x,y)))
    #Now Rank your probes
    rank = 0
    last_group = group_lambda(inLst[0])
    for i in range(len(inLst)):
        rec = inLst[i]
        group = group_lambda(rec)
        if last_group == group:
            rank+=1
        else:
            rank=1
            last_group = group
        SetRank_Lambda(inLst[i], rank) #This is pure evil!! The lambda purists are gnashing their teeth

Lst = [probe(4, 2.0), probe(4, 0.01), probe(4, 0.9), probe(4, 0.999), probe(4, 0.2), probe(1, 2.0), probe(1, 0.01), probe(1, 0.9), probe(1, 0.999), probe(1, 0.2) ]

RankLst(Lst, group_lambda= lambda x:x.group, sortLambdaLst = [lambda x,y:cmp(x.group, y.group), lambda x,y:cmp(x.score, y.score)], SetRank_Lambda = lambda x, rank:x.set_rank(rank))
print('\t'.join(['group', 'score', 'rank']))
for r in Lst: print(r)

Upvotes: 3

Clint Blatchford

Reputation: 599

I'm not sure if this is the most pythonic method ... I had a list of tuples that needed sorting 1st by descending integer values and 2nd alphabetically. This required reversing the integer sort but not the alphabetical sort. Here was my solution: (on the fly in an exam btw, I was not even aware you could 'nest' sorted functions)

a = [('Al', 2),('Bill', 1),('Carol', 2), ('Abel', 3), ('Zeke', 2), ('Chris', 1)]  
b = sorted(sorted(a, key = lambda x : x[0]), key = lambda x : x[1], reverse = True)  
print(b)  
[('Abel', 3), ('Al', 2), ('Carol', 2), ('Zeke', 2), ('Bill', 1), ('Chris', 1)]
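The nested call works because Python's sort is stable: the inner sort orders by name, and the outer sort by number keeps that name order among equal numbers (reverse=True preserves stability as well). A small sketch comparing it with a single-pass alternative that negates the numeric key; the names two_pass and one_pass are only illustrative:

```python
a = [('Al', 2), ('Bill', 1), ('Carol', 2), ('Abel', 3), ('Zeke', 2), ('Chris', 1)]

# Two passes: sort by the secondary key (name) first, then by the
# primary key (number, descending). Stability keeps names in order
# within each group of equal numbers.
two_pass = sorted(sorted(a, key=lambda x: x[0]), key=lambda x: x[1], reverse=True)

# One pass: negate the number so that ascending order on the tuple key
# means descending numbers and ascending names. Only works because the
# reversed field is numeric.
one_pass = sorted(a, key=lambda x: (-x[1], x[0]))

print(two_pass == one_pass)  # True
```

The two-pass form is the one to reach for when the field to reverse cannot be negated, such as a string.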

Upvotes: 59
