user4128992

Top-k on a list of dicts in Python

Is there an easy way to get the k largest values from the key:value pairs in this example?

s1 = {'val' : 0}
s2 = {'val': 10}
s3 = {'val': 5}
s4 = {'val' : 4}
s5 = {'val' : 6}
s6 = {'val' : 7}
s7 = {'val' : 3}
shapelets = [s1,s2,s3,s4,s5,s6,s7]

I want to get the 5 largest numbers in the shapelets list, where each dict has a key named "val" with a value assigned to it. The solution would iterate over the list of dicts and return the n largest values (in this case, the 5 largest).

What would be a simple solution? Does the operator library in Python support such an operation?

Upvotes: 3

Views: 841

Answers (3)

Padraic Cunningham

Reputation: 180481

You could do it in linear time using numpy.argpartition:

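from operator import itemgetter
import numpy as np

# Pull each dict's "val" into a NumPy array
arr = np.array(list(map(itemgetter("val"), shapelets)))

# argpartition puts the indices of the 5 largest values in the last 5 slots
print(arr[np.argpartition(arr, -5)][-5:])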

The 5 max values will not necessarily be in order; if you need them ordered, sort the k elements returned.
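
As a minimal sketch of that last step, assuming the same values as in the question (the names vals, k, top_idx and top_sorted are only illustrative), the partitioned slice can be sorted afterwards. Only k elements get sorted, so the partition still does most of the work:

```python
import numpy as np

# Same "val" values as the shapelets in the question
vals = np.array([0, 10, 5, 4, 6, 7, 3])
k = 5

# Indices of the k largest values, in no particular order
top_idx = np.argpartition(vals, -k)[-k:]

# Sort just those k values, then reverse to get largest first
top_sorted = np.sort(vals[top_idx])[::-1]
print(top_sorted.tolist())  # [10, 7, 6, 5, 4]
```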

Upvotes: 1

Chris_Rands

Reputation: 41188

You can use heapq:

import heapq

s1 = {'val': 0}
s2 = {'val': 10}
s3 = {'val': 5}
s4 = {'val': 4}
s5 = {'val': 6}
s6 = {'val': 7}
s7 = {'val': 3}
shapelets = [s1, s2, s3, s4, s5, s6, s7]

heapq.nlargest(5, [dct['val'] for dct in shapelets])
# [10, 7, 6, 5, 4]

heapq is likely to be faster than sorted for large lists if you only want a few of the largest values. Some discussions of heapq vs. sorted are here.
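
If the dicts themselves are wanted rather than the bare values, heapq.nlargest also accepts a key argument. A small sketch, assuming the same data as above (top5 is just an illustrative name); it also touches on the operator question, since itemgetter can serve as the key:

```python
import heapq
from operator import itemgetter

shapelets = [{'val': v} for v in (0, 10, 5, 4, 6, 7, 3)]

# key= says what to compare on; the original dicts are what gets returned
top5 = heapq.nlargest(5, shapelets, key=itemgetter('val'))
print(top5)
# [{'val': 10}, {'val': 7}, {'val': 6}, {'val': 5}, {'val': 4}]
```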

Upvotes: 0

BPL

Reputation: 9863

Here's a working example:

s1 = {'val': 0}
s2 = {'val': 10}
s3 = {'val': 5}
s4 = {'val': 4}
s5 = {'val': 6}
s6 = {'val': 7}
s7 = {'val': 3}
shapelets = [s1, s2, s3, s4, s5, s6, s7]

print(sorted(shapelets, key=lambda x: x['val'])[-5:])
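
Note that the slice above returns the five dicts in ascending order (4, 5, 6, 7, 10). If largest-first is preferred, one possible variant, sketched here with the illustrative name top5_desc, sorts in reverse and takes the first five:

```python
from operator import itemgetter

shapelets = [{'val': v} for v in (0, 10, 5, 4, 6, 7, 3)]

# Sort largest-first on "val", then keep the first five dicts
top5_desc = sorted(shapelets, key=itemgetter('val'), reverse=True)[:5]
print([d['val'] for d in top5_desc])  # [10, 7, 6, 5, 4]
```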

Upvotes: 1
