user3747200

Reputation: 105

pandas value_counts() with multiple values in list form?

I'm trying to run value_counts() on a specific column in my DataFrame, where some cells hold several comma-separated values.

For example:

    <Fruit>
0   'apple'
1   'apple, orange'
2   'orange'

How do I count each value individually, even when a cell contains several of them? The example above should give me:

'apple'   2
'orange'  2

I tried turning each string into a list, but I'm not sure how to run value_counts() over fields that hold a list of values.

Upvotes: 5

Views: 16014

Answers (2)

Jeff

Reputation: 128948

This is a pandonic way

In [8]: s
Out[8]: 
0            apple
1    apple, orange
2           orange
dtype: object

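Split the strings on their separators, turn each resulting list into a Series, count the values per row, and sum the counts (`Series` here is `pandas.Series`). Rows that lack a given fruit contribute NaN, which is why the summed result is float64.

In [9]: s.str.split(r',\s+').apply(lambda x: Series(x).value_counts()).sum()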
Out[9]: 
apple     2
orange    2
dtype: float64
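
For reference, the same split-and-count idea can be written as a self-contained script. This is only a sketch, and it assumes a pandas version that provides `Series.explode` (0.25 or later); it swaps the per-row `apply` for `explode`, which is a different technique from the call above and keeps the counts as integers.

```python
import pandas as pd

s = pd.Series(['apple', 'apple, orange', 'orange'])

# Split each string on a comma followed by optional whitespace,
# flatten the resulting lists into one long Series, then count.
counts = s.str.split(r',\s*').explode().value_counts()
print(counts)
```

Because nothing is summed across NaN-filled columns, the result should stay an integer dtype rather than float64.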

Upvotes: 6

joaquin

Reputation: 85603

This is your dataframe:

import pandas as pd
df = pd.DataFrame(['apple', 'apple, orange', 'orange'], columns=['fruit'])

Then just join all your entries in the fruit column with a comma, eliminate extra spaces, and split again to have a list with all your fruits. Finally count them:

>>> from collections import Counter
>>> Counter(','.join(df['fruit']).replace(' ', '').split(','))

Counter({'orange': 2, 'apple': 2})
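
One caveat with the chain above: `replace(' ', '')` removes every space, so a multi-word entry such as 'passion fruit' would be mangled. A minimal sketch that strips whitespace per item instead is shown here; the helper name `count_items` is hypothetical, and it accepts any iterable of strings, so `df['fruit']` could be passed in directly.

```python
from collections import Counter

def count_items(values, sep=','):
    """Count separator-delimited items across an iterable of strings."""
    counter = Counter()
    for value in values:
        # Strip whitespace around each item and skip empty pieces.
        counter.update(item.strip() for item in value.split(sep) if item.strip())
    return counter

fruits = ['apple', 'apple, orange', 'orange']
print(count_items(fruits))
```

This uses only the standard library, which can be convenient when the data has not been loaded into pandas yet.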

Upvotes: 0
