Reputation: 369
I have a dataframe which has a column with multiple values, separated by a comma like this:
Country
Australia, Cuba, Argentina
Australia
United States, Canada, United Kingdom, Argentina
I would like to count each unique value, similar to value_counts, like this:
Australia: 2
Cuba: 1
Argentina: 2
United States: 1
My simplest method is shown below, but I suspect that this can be done more efficiently and neatly.
from collections import Counter
Counter(pd.DataFrame(data['Country'].str.split(',', expand=True)).values.ravel())
Cheers
Upvotes: 2
Views: 2454
Reputation: 59274
Another option is to split
and then use value_counts
pd.Series(df.Country.str.split(', ').sum()).value_counts()
Argentina 2
Australia 2
United Kingdom 1
Canada 1
Cuba 1
United States 1
dtype: int64
Upvotes: 3
Reputation: 323226
You can using get_dummies
df.Country.str.get_dummies(sep=', ').sum()
Out[354]:
Argentina 2
Australia 2
Canada 1
Cuba 1
United Kingdom 1
United States 1
dtype: int64
Upvotes: 6