Reputation: 77
I have a dataframe containing a column of lists like the following:
df
pos_tag
0 ['Noun','verb','adjective']
1 ['Noun','verb']
2 ['verb','adjective']
3 ['Noun','adverb']
...
What I would like to get is the number of times each unique element occurs across the whole column, as a dictionary:
Desired output:
my_dict = {'Noun':3, 'verb':3, 'adjective':2, 'adverb':1}
Upvotes: 0
Views: 115
Reputation: 862511
To improve performance, use Counter on the flattened values of the nested lists:
from collections import Counter
my_dict = dict(Counter([y for x in df['pos_tag'] for y in x]))
print (my_dict)
{'Noun': 3, 'verb': 3, 'adjective': 2, 'adverb': 1}
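For anyone who wants to run this end to end, here is a minimal sketch. The sample frame is an assumption rebuilt from the four rows shown in the question, and `itertools.chain.from_iterable` is swapped in for the nested comprehension; it flattens the lists in the same way, so the counts should match.

```python
from collections import Counter
from itertools import chain

import pandas as pd

# Sample data rebuilt from the question (assumed; the real frame has more rows).
df = pd.DataFrame({
    "pos_tag": [
        ["Noun", "verb", "adjective"],
        ["Noun", "verb"],
        ["verb", "adjective"],
        ["Noun", "adverb"],
    ]
})

# chain.from_iterable yields every tag from every list, one after another;
# Counter tallies them, and dict() converts the result to a plain dictionary.
my_dict = dict(Counter(chain.from_iterable(df["pos_tag"])))
print(my_dict)
```

If you also need the tags ranked by frequency, keeping the Counter object and calling its `most_common()` method is probably simpler than sorting the dictionary afterwards.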
Upvotes: 1
Reputation: 71689
Use Series.explode along with Series.value_counts and Series.to_dict:
freq = df['pos_tag'].explode().value_counts().to_dict()
Result:
# print(freq)
{'Noun':3, 'verb':3, 'adjective':2, 'adverb':1}
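A self-contained sketch of the same chain, in case it helps to see each step. The sample frame is an assumption based on the question, with one hypothetical empty list added to show an edge case: `explode` turns an empty list into NaN, and `value_counts` drops NaN by default, so empty rows should not show up in the result. Note that `Series.explode` needs pandas 0.25 or later.

```python
import pandas as pd

# Sample data rebuilt from the question, plus one hypothetical empty row.
df = pd.DataFrame({
    "pos_tag": [
        ["Noun", "verb", "adjective"],
        ["Noun", "verb"],
        ["verb", "adjective"],
        ["Noun", "adverb"],
        [],  # hypothetical: not in the original question
    ]
})

# Step 1: one row per tag (the empty list becomes a single NaN row).
exploded = df["pos_tag"].explode()

# Step 2: count each distinct tag; NaN is excluded by default.
counts = exploded.value_counts()

# Step 3: convert the resulting Series (tag -> count) to a dictionary.
freq = counts.to_dict()
print(freq)
```

`value_counts` sorts by descending count, so the key order of `freq` may differ from the order in which the tags first appear in the column.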
Upvotes: 2