bennietgek

Reputation: 65

Counting most prevalent item in dictionary of lists python

I have a dictionary that looks like this:

dict_users = {
    "user1": ["item1", "item2", "item3", "item1", "item2", "elem3", "thing4", "thing5", "thing6"],
    "user2": ["elem5", "elem8", "elem2", "elem3", "elem8", "elem5", "thing7", "thing1", "thing9"],
    "user3": ["thing9", "thing7", "thing1", "thing4", "elem3", "elem9", "thing3", "thing5", "thing2"],
}

From this, I would like to build a new dictionary that maps each user to the type of item that occurs most often in their list. For the example above, the output would be:

dict_counted = {
    'user1': 'item',
    'user2': 'elem',
    'user3': 'thing',
}

I now have something like this:

users = ['user1', 'user2', 'user3']

dictOfRatios = dict.fromkeys(users)

for key, value in dict_users.items():
    for value in dict_users[key]:
        if value.startswith("item"):
            itemlist = list(value)
            for user in dictOfRatios:
                dictOfRatios[user] = len(itemlist)
                
print(dictOfRatios)

But the output is not as desired, and it even gives the wrong number. The matching criterion could be anything from a single letter (i, e, t) to the complete prefix (item, elem, thing).

Upvotes: 0

Views: 83

Answers (3)

Heshan Kumarasinghe

Reputation: 83

Try this:

dict_users = {
    'user1': ['a', 'a', 'a', 'b', 'b', 'c'],
    'user2': ['1', '1', '1', '2', '2', '2', '2'],
    'user3': ['!', '!', '!', '!', '@', '@', '@', '@', '@']
}

unique_values = {}
final_dict = {}

for key, value in dict_users.items():
    unique_values[key] = set(value)

for key, value in unique_values.items():
    count = 0  # reset the best count for each user
    for el in value:
        if dict_users[key].count(el) > count:
            count = dict_users[key].count(el)
            final_dict[key] = el  # remember the most frequent element so far

print(final_dict)

This takes the dictionary,

dict_users = {
    'user1': ['a', 'a', 'a', 'b', 'b', 'c'],
    'user2': ['1', '1', '1', '2', '2', '2', '2'],
    'user3': ['!', '!', '!', '!', '@', '@', '@', '@', '@']
}

and gives you,

{'user1': 'a', 'user2': '2', 'user3': '@'}

I hope this is what you have wanted to achieve. :)
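
The same exact-match count can probably be written more compactly with collections.Counter. This is only a sketch of an alternative to the loop above; the helper name most_common_item is made up for illustration, and it assumes every list is non-empty (an empty list would raise an IndexError).

```python
from collections import Counter

dict_users = {
    'user1': ['a', 'a', 'a', 'b', 'b', 'c'],
    'user2': ['1', '1', '1', '2', '2', '2', '2'],
    'user3': ['!', '!', '!', '!', '@', '@', '@', '@', '@'],
}

def most_common_item(items):
    # Hypothetical helper: most_common(1) returns a list like [(item, count)]
    return Counter(items).most_common(1)[0][0]

final_dict = {user: most_common_item(items) for user, items in dict_users.items()}
print(final_dict)  # {'user1': 'a', 'user2': '2', 'user3': '@'}
```

Note that this counts whole strings, so 'item1' and 'item2' would still be treated as different items.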

Upvotes: 1

woblob

Reputation: 1377

Python's collections.Counter is what you need:

from collections import Counter
import re

dict_users = {
'user1': ['item1', 'item2', 'item3', 'item1', 'item2', 'elem3', 'thing4', 'thing5', 'thing6'],
'user2': ['elem5', 'elem8', 'elem2', 'elem3', 'elem8', 'elem5', 'thing7', 'thing1', 'thing9'],
'user3': ['thing9', 'thing7', 'thing1', 'thing4', 'elem3', 'elem9', 'thing3', 'thing5', 'thing2']
}

users = {user: Counter() for user in dict_users.keys()}

for us, lst in dict_users.items():
    user_counter = users[us]
    for el in lst:
        item_name = re.split(r"\d", el)[0]  # keep the text before the first digit
        user_counter[item_name] += 1

dict_counted = {user: counter.most_common(1)[0][0] for user, counter in users.items()}
print(dict_counted)

Outputs:

{
 'user1': 'item',
 'user2': 'elem',
 'user3': 'thing'
}

Upvotes: 1

leoOrion

Reputation: 1957

In your code -

itemlist = list(value)

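Inside the inner loop, value is a single string such as "item1", so list(value) splits it into characters, and len(itemlist) is the length of that string rather than the number of matching items. The innermost loop then writes that number to every user.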
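
A quick way to check what that line does, assuming value is one element of a user's list as in the inner loop of the question (the short values list below is made up for the demonstration):

```python
value = "item1"          # one element of a user's list, as in the inner loop
itemlist = list(value)   # list() on a string splits it into characters
print(itemlist)          # ['i', 't', 'e', 'm', '1']
print(len(itemlist))     # 5

# Counting matching elements has to look at the whole list instead
values = ["item1", "item2", "elem3"]
item_count = sum(1 for v in values if v.startswith("item"))
print(item_count)        # 2
```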

This will solve your problem:

dict_users = {
    'user1': ['item1', 'item2', 'item3', 'item1', 'item2', 'elem3', 'thing4', 'thing5', 'thing6'],
    'user2': ['elem5', 'elem8', 'elem2', 'elem3', 'elem8', 'elem5', 'thing7', 'thing1', 'thing9'],
    'user3': ['thing9', 'thing7', 'thing1', 'thing4', 'elem3', 'elem9', 'thing3', 'thing5', 'thing2'],
}

new_dict = {}
for user, value in dict_users.items():
    # Count the entries of each type for this user
    item_count = sum(1 for each in value if each.startswith('item'))
    elem_count = sum(1 for each in value if each.startswith('elem'))
    thing_count = sum(1 for each in value if each.startswith('thing'))
    # Keep the type with the highest count (ties go to the earlier type)
    max_count = item_count
    new_value = 'item'
    if elem_count > max_count:
        max_count = elem_count
        new_value = 'elem'
    if thing_count > max_count:
        max_count = thing_count
        new_value = 'thing'
    new_dict[user] = new_value

print(new_dict)

Edit:

Just saw that the list values may have single characters to denote item, elem & thing.

Look into regex and how to match with it. The same code but instead of using startswith, use regex to match.
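
A minimal sketch of the regex variant, under the assumption that every entry starts with i, e or t and that the first letter alone identifies the type. The CATEGORY mapping, the PATTERN name and the mixed sample data are made up for illustration:

```python
import re
from collections import Counter

# Assumption: the first letter (i, e or t) is enough to identify the type,
# whether the entry is written as "i1" or as "item1".
CATEGORY = {"i": "item", "e": "elem", "t": "thing"}
PATTERN = re.compile(r"[iet]")

# Made-up sample data mixing short and full prefixes
dict_users = {
    "user1": ["item1", "i2", "item3", "e3", "thing4"],
    "user2": ["e5", "elem8", "t7", "elem2"],
    "user3": ["t9", "thing7", "elem3", "i1", "t2"],
}

new_dict = {}
for user, values in dict_users.items():
    counts = Counter()
    for each in values:
        match = PATTERN.match(each)  # match() only looks at the start of the string
        if match:                    # entries with an unknown prefix are skipped
            counts[CATEGORY[match.group()]] += 1
    if counts:                       # users with no recognised entries are left out
        new_dict[user] = counts.most_common(1)[0][0]

print(new_dict)  # {'user1': 'item', 'user2': 'elem', 'user3': 'thing'}
```

If the real data uses other prefixes, the mapping and the character class would need to change accordingly.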

Upvotes: 1
