Reputation: 1560
I have hundreds of dictionaries that look like this. They all have the same keys (New York, Chicago, etc.) but different values. There are no missing values.
[{'New York': 'cloudy', 'Chicago': 'snowy', 'Seattle': 'rainy'},
{'New York': 'cloudy', 'Chicago': 'hailing', 'Seattle': 'sunny'},
{'New York': 'sunny', 'Chicago': 'snowy', 'Seattle': 'rainy'},
{'New York': 'hailing', 'Chicago': 'snowy', 'Seattle': 'snowy'}]
I want to find the most common "weather" value for each key, and then combine the results into one final dictionary that maps each city to its most common value:
{'New York': 'cloudy', 'Chicago': 'snowy', 'Seattle': 'rainy'}
How can I achieve this?
Upvotes: 2
Views: 44
Reputation: 71570
Why not a one-line dictionary comprehension? With the list of dictionaries stored in weather:
print({k:max([i[k] for i in weather],key=[i[k] for i in weather].count) for k in list(weather[0].keys())})
This gives the desired output:
{'New York': 'cloudy', 'Chicago': 'snowy', 'Seattle': 'rainy'}
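The comprehension above builds the list of values twice per city and calls list.count once per element. A minimal sketch of the same idea using statistics.mode from the standard library, assuming Python 3.8 or later (earlier versions raise StatisticsError when two values tie) and assuming the list is named weather as above; the names city, day and most_common are only illustrative:

```python
from statistics import mode

weather = [
    {'New York': 'cloudy', 'Chicago': 'snowy', 'Seattle': 'rainy'},
    {'New York': 'cloudy', 'Chicago': 'hailing', 'Seattle': 'sunny'},
    {'New York': 'sunny', 'Chicago': 'snowy', 'Seattle': 'rainy'},
    {'New York': 'hailing', 'Chicago': 'snowy', 'Seattle': 'snowy'},
]

# mode() returns the most frequent value; on Python 3.8+ a tie goes to
# the value that appears first in the data.
most_common = {city: mode([day[city] for day in weather]) for city in weather[0]}
print(most_common)
# {'New York': 'cloudy', 'Chicago': 'snowy', 'Seattle': 'rainy'}
```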
Upvotes: 0
Reputation: 71451
You can iterate over the list to group each city with all its related weather values and then use collections.Counter:
from collections import Counter
d = [{'New York': 'cloudy', 'Chicago': 'snowy', 'Seattle': 'rainy'}, {'New York': 'cloudy', 'Chicago': 'hailing', 'Seattle': 'sunny'}, {'New York': 'sunny', 'Chicago': 'snowy', 'Seattle': 'rainy'}, {'New York': 'hailing', 'Chicago': 'snowy', 'Seattle': 'snowy'}]
weather = {i:Counter([c[i] for c in d]).most_common(1)[0][0] for b in d for i in b}
Output:
{'New York': 'cloudy', 'Chicago': 'snowy', 'Seattle': 'rainy'}
Edit: assuming all dictionaries in d contain the same keys, it is enough to iterate over the keys of the first dictionary in the list:
weather = {i:Counter([c[i] for c in d]).most_common(1)[0][0] for i in d[0]}
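For readers who prefer the steps spelled out, the same Counter approach can be sketched as a small function. The name most_common_per_key and the empty-list handling are assumptions added for illustration, not part of the answer above:

```python
from collections import Counter

def most_common_per_key(records):
    """Return {key: most frequent value} for a list of dicts sharing the same keys."""
    if not records:
        return {}
    result = {}
    # All dicts are assumed to have the same keys, so the first one is enough.
    for key in records[0]:
        counts = Counter(record[key] for record in records)
        # most_common(1) returns [(value, count)]; on a tie, the value
        # encountered first wins.
        result[key] = counts.most_common(1)[0][0]
    return result

d = [
    {'New York': 'cloudy', 'Chicago': 'snowy', 'Seattle': 'rainy'},
    {'New York': 'cloudy', 'Chicago': 'hailing', 'Seattle': 'sunny'},
    {'New York': 'sunny', 'Chicago': 'snowy', 'Seattle': 'rainy'},
    {'New York': 'hailing', 'Chicago': 'snowy', 'Seattle': 'snowy'},
]
print(most_common_per_key(d))
# {'New York': 'cloudy', 'Chicago': 'snowy', 'Seattle': 'rainy'}
```

Each city needs one pass over the list, so the total work grows with the number of dictionaries times the number of keys.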
Upvotes: 8
Reputation: 57033
Convert your list of dictionaries into a pandas DataFrame, count the values for each city, and take the index of the highest count:
import pandas as pd
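# pd.Series.value_counts replaces the top-level pd.value_counts, which is deprecated in recent pandas
typical = pd.DataFrame(your_list_of_dicts).apply(pd.Series.value_counts).idxmax()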
#Chicago snowy
#New York cloudy
#Seattle rainy
Make it a dictionary if needed:
typical.to_dict()
#{'Chicago': 'snowy', 'New York': 'cloudy', 'Seattle': 'rainy'}
Upvotes: 2