Reputation: 1560

Counting the most common value in an arbitrary number of dictionaries

I have hundreds of dictionaries that look like this. They all have the same keys (New York, Chicago, etc.) but different values. There are no missing values.

[{'New York': 'cloudy', 'Chicago': 'snowy', 'Seattle': 'rainy'},
 {'New York': 'cloudy', 'Chicago': 'hailing', 'Seattle': 'sunny'},
 {'New York': 'sunny', 'Chicago': 'snowy', 'Seattle': 'rainy'}, 
 {'New York': 'hailing', 'Chicago': 'snowy', 'Seattle':'snowy'}]

I want to find the most common "weather" value for each key, and then combine the results into one final dictionary that maps each city to its most common value.

{'New York': 'cloudy', 'Chicago': 'snowy', 'Seattle': 'rainy'}

How can I achieve this?

Upvotes: 2

Answers (3)

U13-Forward

Reputation: 71570

Why not a one-line dictionary comprehension? For example, assuming the list of dictionaries is named weather:

print({k: max([i[k] for i in weather], key=[i[k] for i in weather].count) for k in weather[0]})

Now you get the desired output as:

{'New York': 'cloudy', 'Chicago': 'snowy', 'Seattle': 'rainy'}
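
The one-liner above rebuilds the list of values for each city twice and calls list.count once per element, which may get slow with hundreds of dictionaries. As a possible alternative (not part of the original answer), here is a minimal sketch using statistics.mode from the standard library. It assumes Python 3.8 or later, where a tie resolves to the value encountered first, and it reuses the name weather for the sample data from the question:

```python
from statistics import mode

# Sample data from the question; the name `weather` matches the one-liner above.
weather = [
    {'New York': 'cloudy', 'Chicago': 'snowy', 'Seattle': 'rainy'},
    {'New York': 'cloudy', 'Chicago': 'hailing', 'Seattle': 'sunny'},
    {'New York': 'sunny', 'Chicago': 'snowy', 'Seattle': 'rainy'},
    {'New York': 'hailing', 'Chicago': 'snowy', 'Seattle': 'snowy'},
]

# mode() returns the most frequent value among the values passed in.
# Assumption: every dictionary has the same keys as the first one.
most_common = {city: mode(d[city] for d in weather) for city in weather[0]}
print(most_common)
```

On Python versions before 3.8, mode raises StatisticsError when several values tie for most frequent, so this sketch is only a drop-in replacement on newer interpreters.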

Upvotes: 0

Ajax1234

Reputation: 71451

You can iterate over the list to group each city with all its related weather values and then use collections.Counter:

from collections import Counter
d = [{'New York': 'cloudy', 'Chicago': 'snowy', 'Seattle': 'rainy'}, {'New York': 'cloudy', 'Chicago': 'hailing', 'Seattle': 'sunny'}, {'New York': 'sunny', 'Chicago': 'snowy', 'Seattle': 'rainy'}, {'New York': 'hailing', 'Chicago': 'snowy', 'Seattle': 'snowy'}]
weather = {i:Counter([c[i] for c in d]).most_common(1)[0][0] for b in d for i in b}

Output:

{'New York': 'cloudy', 'Chicago': 'snowy', 'Seattle': 'rainy'}

Edit: since all dictionaries in d contain the same keys, it is enough to iterate over the keys of the first dictionary in the list:

weather = {i:Counter([c[i] for c in d]).most_common(1)[0][0] for i in d[0]}
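
If the same-keys assumption might not hold, or if you would rather walk the list only once, a variation on the same Counter idea can keep one Counter per city. This is only a sketch; the helper name most_common_per_key is made up for illustration and is not from the question:

```python
from collections import Counter, defaultdict

def most_common_per_key(dicts):
    """Return {key: most frequent value} across a list of dictionaries."""
    counts = defaultdict(Counter)  # one Counter per key, created on first use
    for record in dicts:
        for key, value in record.items():
            counts[key][value] += 1
    # most_common(1) returns a one-element list of (value, count) pairs.
    return {key: counter.most_common(1)[0][0] for key, counter in counts.items()}

d = [
    {'New York': 'cloudy', 'Chicago': 'snowy', 'Seattle': 'rainy'},
    {'New York': 'cloudy', 'Chicago': 'hailing', 'Seattle': 'sunny'},
    {'New York': 'sunny', 'Chicago': 'snowy', 'Seattle': 'rainy'},
    {'New York': 'hailing', 'Chicago': 'snowy', 'Seattle': 'snowy'},
]
result = most_common_per_key(d)
print(result)
```

Each dictionary is visited once, and a key that is missing from some dictionaries is simply counted over the dictionaries that do contain it.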

Upvotes: 8

DYZ

Reputation: 57033

Convert your list of dictionaries into a pandas DataFrame, count the values for each city, and find the index of the maximal count:

import pandas as pd
typical = pd.DataFrame(your_list_of_dicts).apply(pd.Series.value_counts).idxmax()
#Chicago      snowy
#New York    cloudy
#Seattle      rainy

Make it a dictionary if needed:

typical.to_dict()
#{'Chicago': 'snowy', 'New York': 'cloudy', 'Seattle': 'rainy'}

Upvotes: 2
