rorance_
rorance_

Reputation: 369

Pandas - value_counts on multiple values in one cell

I have a dataframe which has a column with multiple values, separated by a comma like this:

Country
Australia, Cuba, Argentina
Australia
United States, Canada, United Kingdom, Argentina

I would like to count each unique value, similar to value_counts, like this:

Australia: 2
Cuba: 1
Argentina: 2
United States: 1

My simplest method is shown below, but I suspect that this can be done more efficiently and neatly.

from collections import Counter
Counter(pd.DataFrame(data['Country'].str.split(',', expand=True)).values.ravel())

Cheers

Upvotes: 2

Views: 2454

Answers (2)

rafaelc
rafaelc

Reputation: 59274

Another option is to split and then use value_counts

pd.Series(df.Country.str.split(', ').sum()).value_counts()

Argentina         2
Australia         2
United Kingdom    1
Canada            1
Cuba              1
United States     1
dtype: int64

Upvotes: 3

BENY
BENY

Reputation: 323226

You can using get_dummies

df.Country.str.get_dummies(sep=', ').sum()
Out[354]: 
Argentina         2
Australia         2
Canada            1
Cuba              1
United Kingdom    1
United States     1
dtype: int64

Upvotes: 6

Related Questions