pulsar

Reputation: 141

key function for heapq.nlargest()

I have a dictionary of {key: count}, say status_count = {'MANAGEMENT ANALYSTS': 13859, 'COMPUTER PROGRAMMERS': 72112}, and I am trying to write a key function for heapq.nlargest() that sorts by count and, when counts tie, by the keys in alphabetical (a-z) order. I have to use heapq.nlargest because N is very large and k is small (k = 10).

This is what I have so far:

top_k_results = heapq.nlargest(args.top_k, status_count.items(), key=lambda item: (item[1], item[0]))

But this breaks ties the wrong way: keys with equal counts come out in reverse alphabetical order (z-a) instead of a-z. Please help!

Upvotes: 4

Views: 4449

Answers (1)

jpp

Reputation: 164663

The simplest fix may be to switch to heapq.nsmallest and negate the count in the sort key, so that the largest counts come first and ties fall back to a-z order:

from heapq import nsmallest

def sort_key(x):
    return -x[1], x[0]

top_k_results = nsmallest(args.top_k, status_count.items(), key=sort_key)
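To see the tie-breaking in action, here is a minimal sketch on made-up data. The extra dictionary entries and the fixed k of 3 are assumptions for illustration, not values from the question.

```python
from heapq import nsmallest

# Hypothetical sample data; two entries tie on the count 72112.
status_count = {
    "MANAGEMENT ANALYSTS": 13859,
    "COMPUTER PROGRAMMERS": 72112,
    "ACCOUNTANTS": 72112,
    "SOFTWARE DEVELOPERS": 99000,
}

def sort_key(item):
    name, count = item
    # Negating the count makes the largest count the "smallest" key;
    # the name then breaks ties in ordinary a-z order.
    return -count, name

top_3 = nsmallest(3, status_count.items(), key=sort_key)
print(top_3)
# [('SOFTWARE DEVELOPERS', 99000), ('ACCOUNTANTS', 72112), ('COMPUTER PROGRAMMERS', 72112)]
```

Note that negating the count only works because the counts are numbers; the names are left untouched, which is why plain string comparison is enough here.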

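Alternatively, you can keep nlargest and negate the code point of each character with ord, so that a larger key corresponds to a name earlier in the alphabet. A trailing positive sentinel is needed so that a name which is a prefix of another (e.g. 'ANALYST' vs 'ANALYSTS') still ranks first:

from heapq import nlargest

def sort_key(x):
    # Negated code points reverse the string order; the trailing 1 outranks
    # any negated code point, so a shorter prefix sorts ahead of a longer name.
    return x[1], [-ord(i) for i in x[0]] + [1]

top_k_results = nlargest(args.top_k, status_count.items(), key=sort_key)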
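If per-character arithmetic feels fragile, another way to stay with nlargest is to wrap the name in a small class whose ordering is inverted. This is only a sketch: ReverseStr is a hypothetical helper written for this answer, not part of heapq or the standard library, and the sample data is made up.

```python
from functools import total_ordering
from heapq import nlargest

@total_ordering
class ReverseStr:
    """Hypothetical helper: wraps a string and inverts its sort order."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return self.value == other.value

    def __lt__(self, other):
        # Inverted on purpose: "less than" here means "later in the alphabet".
        return self.value > other.value

# Hypothetical sample data; three entries tie, and one name is a prefix of another.
status_count = {"ANALYSTS": 10, "ANALYST": 10, "CLERKS": 10, "PROGRAMMERS": 50}

def sort_key(item):
    name, count = item
    return count, ReverseStr(name)

top_3 = nlargest(3, status_count.items(), key=sort_key)
print(top_3)
# [('PROGRAMMERS', 50), ('ANALYST', 10), ('ANALYSTS', 10)]
```

The wrapper delegates to normal string comparison, so prefixes and non-ASCII characters are ordered exactly as str would order them, just reversed.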

Remember to use str.casefold if you need to normalize the case of your string.
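As a rough illustration of the case-normalization point, the sketch below applies str.casefold inside the key so that mixed-case names tie-break in a-z order regardless of capitalization. The mixed-case data is made up for the example.

```python
from heapq import nsmallest

# Hypothetical data with mixed-case names; three entries tie on the count 5.
status_count = {"accountants": 5, "Bakers": 5, "CLERKS": 5, "drivers": 9}

def sort_key(item):
    name, count = item
    # casefold() is applied only for comparison; the original name is kept in the result.
    return -count, name.casefold()

top_3 = nsmallest(3, status_count.items(), key=sort_key)
print(top_3)
# [('drivers', 9), ('accountants', 5), ('Bakers', 5)]
```

Without casefold, uppercase letters sort before lowercase ones, so 'Bakers' and 'CLERKS' would both come ahead of 'accountants'.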

Upvotes: 1
