ugradmath

Reputation: 107

Ranking python dictionary by percentile

If I have a dictionary that records the frequency counts of random objects:

dict = {'oranges': 4 , 'apple': 3 , 'banana': 3 , 'pear' :1, 'strawberry' : 1....}

And I want only the keys that are in the top 25 percent by frequency. How would I do that? In particular, what if the list has a very long tail and a lot of records have the same count?

Upvotes: 2

Views: 1274

Answers (1)

Moses Koledoye

Reputation: 78554

Use a collections.Counter object and exploit its most_common method to return the keys with the highest frequency up to the required percentile.

For the top 25 percent, divide the length of the dictionary by 4 (using integer division) and pass that value to most_common:

>>> from collections import Counter
>>> dct = {'oranges': 4 , 'apple': 3 , 'banana': 3 , 'pear' :1, 'strawberry' : 1}
>>> c = Counter(dct)
>>> [tup[0] for tup in c.most_common(len(dct)//4)]
['oranges']

Note that when several keys share the same frequency at the cutoff, only some of them are selected, and which ones make the cut is arbitrary.
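
If every key tied at the cutoff should be kept instead, one possible approach is to read the count at the cutoff position and keep all keys whose count is at least that value. The following is only a sketch under that assumption; the helper name `top_fraction_with_ties` and its `fraction` parameter are made up for illustration and are not part of collections.

```python
from collections import Counter

def top_fraction_with_ties(counts, fraction=0.25):
    """Return keys in the top `fraction` by count, keeping every key tied at the cutoff.

    Hypothetical helper, not part of the standard library.
    """
    k = int(len(counts) * fraction)       # number of keys a plain most_common(k) would return
    if k == 0:
        return []
    ranked = Counter(counts).most_common()  # all (key, count) pairs, highest count first
    cutoff = ranked[k - 1][1]               # count of the last key inside the cut
    return [key for key, count in ranked if count >= cutoff]

dct = {'oranges': 4, 'apple': 3, 'banana': 3, 'pear': 1, 'strawberry': 1}
print(top_fraction_with_ties(dct))        # ['oranges']
print(top_fraction_with_ties(dct, 0.5))   # keeps both 'apple' and 'banana', which tie at 3
```

With many ties in a long tail, this can return more than the requested fraction of keys, so whether it fits depends on how strict the 25 percent limit needs to be.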

Upvotes: 3
