Reputation: 107
If I have a dictionary that records the frequency counts of random objects:
dict = {'oranges': 4 , 'apple': 3 , 'banana': 3 , 'pear' :1, 'strawberry' : 1....}
And I want only the keys that are in the top 25% by frequency, how would I do that? Especially if it's a very long-tailed list and a lot of records have the same count.
Upvotes: 2
Views: 1274
Reputation: 78554
Use a collections.Counter object and exploit its most_common method to return the keys with the highest frequency up to the required percentile.
For the 25th percentile, divide the length of the dictionary by 4 and pass that value to most_common:
>>> from collections import Counter
>>> dct = {'oranges': 4 , 'apple': 3 , 'banana': 3 , 'pear' :1, 'strawberry' : 1}
>>> c = Counter(dct)
>>> [tup[0] for tup in c.most_common(len(dct)//4)]
['oranges']
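Note that most_common does not keep ties together: if several keys share the count at the cutoff, only some of them are returned. The choice among them carries no meaning (on Python 3.7+ equal counts come back in first-inserted order; on older versions the order is arbitrary).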
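If the tie behaviour matters for a long-tailed distribution, one option is to take the count of the last key that makes the cut and keep every key with at least that count. The following is only a sketch of that idea, not part of the Counter API: the helper name top_fraction_with_ties and its fraction parameter are made up for illustration, and the result can hold more than 25% of the keys when many of them tie at the cutoff.

```python
from collections import Counter

def top_fraction_with_ties(freqs, fraction=0.25):
    """Return keys in the top `fraction` by count, keeping every key tied at the cutoff.

    Hypothetical helper for illustration; `fraction` is an assumed parameter.
    """
    n = int(len(freqs) * fraction)          # how many keys the plain cut would keep
    if n == 0:
        return []
    ranked = Counter(freqs).most_common()   # all (key, count) pairs, highest count first
    cutoff = ranked[n - 1][1]               # count of the last key inside the cut
    return [key for key, count in ranked if count >= cutoff]

dct = {'oranges': 4, 'apple': 3, 'banana': 3, 'pear': 1, 'strawberry': 1}
print(top_fraction_with_ties(dct))          # ['oranges']
print(top_fraction_with_ties(dct, 0.4))     # ['oranges', 'apple', 'banana']
```

With fraction=0.4 the plain cut would keep two keys, but 'apple' and 'banana' both have a count of 3, so both are returned.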
Upvotes: 3