Reputation: 531
I've got a list of object in python which i want to sort based on a attribute.
For eg:
abc is a class with attributes id and count.
I have a list objects of the class abc.
list=[abc('1',120),abc('2',0),abc('0',180),abc('5',150)].
I want to sort the list in ascending order of the attribute 'count'
I have done it using:
list.sort(key=attrgetter('count'))
I have found using profiling my python script that it takes lot of time for sorting.
Can anyone suggest a better and faster way to sort list of object based on a attribute minimizing the time for sorting.
Upvotes: 2
Views: 4905
Reputation: 17715
I believe the sort
method is implemented using Timsort algorithm, so there is not much you can improve in terms of sorting.
What you can do is to insert the elements differently provided you have the control over the inserting part of code.
For example you could use a binary heap to optimize retreiving of the largest element (see heapq
module in Python) or binary search tree to maintain the sorting order.
The data strcuture you choose, mainly depends on what do you want to do with the elements.
Upvotes: 0
Reputation: 4314
A nitpick: you're using the name list
for your list which will overwrite the standard list
class. You'd be better off using l
as the list name.
I tested sorting a list containing 12 times the contents of your list 100000 times. It took 0.848 s without a comparator function or a key when I used the sorted() function to avoid re-sorting an already sorted list.
There are at least three ways I can think of:
A. Use the sort() with the comparator function:
def comparator(x, y):
return cmp(x.count, y.count)
l.sort(cmp=comparator)
This took 9.598 s on my system when I used the sorted() function to avoid re-sorting an already sorted list.
B. Use the sort() with the key function:
l.sort(key=operator.attrgetter('count'))
This took 3.111 s on my system when I used the sorted() function to avoid re-sorting an already sorted list.
C. Use native C code to improve the performance of the sorting. I didn't test this.
So, you seem to be already using the fastest all-Python way there is and the way forward would be the use of native C code.
Upvotes: 2
Reputation: 9745
If I understand you correctly:
class Abc(object):
def __init__(self, name, count):
self.name = name
self.count = count
@classmethod
def sort_key(cls, key):
if key == 'count':
return lambda obj: obj.count
elif key == 'name':
return lambda obj: obj.name
lst = [Abc('1', 120), Abc('2', 0), Abc('0', 180), Abc('5', 150)]
lst.sort(key=Abc.sort_key('count'))
for e in lst:
print e.name, e.count
print
lst.sort(key=Abc.sort_key('name'))
for e in lst:
print e.name, e.count
print
I'd not recommend you use 'id', 'abc' and 'list' as names of arbitrary variables because they are keywords in python.
Upvotes: 0