How to get a set of keys with largest values?

Question

I am working on a function

def common_words(dictionary, N):
     if len(dictionary) > N:
         max(dictionary, key=dictionary.get)

Description of the function is:

The first parameter is the dictionary of word counts and the second is a positive integer N. This function should update the dictionary so that it includes only the most common (highest frequency) words. At most N words should be included in the dictionary. If including all words with some word count would result in a dictionary with more than N words, then none of the words with that word count should be included (i.e., in the case of a tie for the (N+1)st most common word, omit all of the words in the tie).

So I know that I need to get the N items with the highest values, but I am not sure how to do that. I also know that once I have those N items, I need to pop out any whose values are tied at the cutoff.

For example, given

k = {'a':5, 'b':4, 'c':4, 'd':1}

then

common_words(k, 2)

should modify k so that it becomes {'a':5}.

Skycc · Accepted Answer

My algorithm is as follows:

First, build a list of (key, value) tuples from the dictionary, sorted by value from largest to smallest.
Then check whether item[N-1] and item[N] have the same value. If they do, the cutoff falls inside a tie, so decrease N and check again until the whole tied group is excluded (indices start at 0, hence N-1).
Finally, convert the first N tuples of the list back to a dict. You can use OrderedDict here instead if you want to retain the order of the items.

If the dictionary has N items or fewer, it is returned as it is.

def common_words(dictionary, N):
    if len(dictionary) > N:
        # (key, value) pairs sorted by value, largest first
        tmp = [(k, dictionary[k]) for k in sorted(dictionary, key=dictionary.get, reverse=True)]
        # while the cutoff splits a tie, shrink N so the whole tied group is dropped
        while N > 0 and tmp[N-1][1] == tmp[N][1]:
            N -= 1
        return dict(tmp[:N])
        # return [i[0] for i in tmp[:N]]  # use this line instead of the one above to get only the keys, as the title asks
    else:
        return dictionary
        # return list(dictionary.keys())  # use this line instead of the one above to get only the keys

>>> common_words({'a':5, 'b':4, 'c':4, 'd':1}, 2)
{'a': 5}

If you want to modify the input dictionary inside the function and return None, as the OP asks, the function can be changed as below:

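def common_words(dictionary, N):
    if len(dictionary) > N:
        # (key, value) pairs sorted by value, largest first
        tmp = [(k, dictionary[k]) for k in sorted(dictionary, key=dictionary.get, reverse=True)]
        # while the cutoff splits a tie, shrink N so the whole tied group is dropped
        while N > 0 and tmp[N-1][1] == tmp[N][1]:
            N -= 1
        # remove everything past the cutoff from the original dictionary
        for i in tmp[N:]:
            dictionary.pop(i[0])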

>>> k = {'a':5, 'b':4, 'c':4, 'd':1}
>>> common_words(k, 2)
>>> k
{'a': 5}
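
The same tie rule can also be written in terms of a cutoff count. This is only a sketch of an alternative, not the code above: the name keep_most_common is made up for illustration, and it assumes N is a non-negative integer. The idea is that when the dictionary has more than N words, the count in position N of the sorted counts (the (N+1)st most common) marks the cutoff, and only words with a strictly larger count can stay, which drops a tied group of any size at once.

```python
def keep_most_common(counts, n):
    """Trim counts in place to at most n words, dropping any tie at the cutoff.

    keep_most_common is a hypothetical name used for this sketch.
    """
    if len(counts) <= n:
        return  # nothing to trim
    # Count of the (n+1)st most common word; words at or below it are removed
    cutoff = sorted(counts.values(), reverse=True)[n]
    # Build the list of keys first so the dict is not mutated while iterating
    for word in [w for w, c in counts.items() if c <= cutoff]:
        del counts[word]


k = {'a': 5, 'b': 4, 'c': 4, 'd': 1}
keep_most_common(k, 2)
print(k)  # {'a': 5}
```

With N = 2 the sorted counts are 5, 4, 4, 1, so the cutoff is 4 and only 'a' remains.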
