
Reputation: 79

Count string occurrence in a list in cell

In Python, I have a DataFrame like this:

	Fruits
James	[Apple, Pear, Apple]
Peter	[Apple, Pear, Apple]

I would like to count how many times Apple and Pear appear in each row's list, giving the result below. I would appreciate any help with this.

	Fruits	Apple	Pear
James	[Apple, Pear, Apple]	2	1
Peter	[Apple, Pear, Apple]	2	1

I tried these two approaches:

d['Apple'] = (d.Fruits == 'Apple').sum()

and

d['Apple'] = (d.Fruits.values == 'Apple').sum()

Upvotes: 3

Answers (4)

Ata Reenes

Reputation: 188

For any list you can use collections.Counter. The logic is simple: Counter(items) returns how many times each item occurs in the list. You can loop over the lists in your column and count the items in each one to get your output.
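As a rough sketch of that idea (the sample frame is rebuilt from the question, and the names counts and per_row are only illustrative), counting one list and then looping over every row could look like this:

```python
from collections import Counter

import pandas as pd

# Sample frame reconstructed from the question
df = pd.DataFrame(
    {"Fruits": [["Apple", "Pear", "Apple"], ["Apple", "Pear", "Apple"]]},
    index=["James", "Peter"],
)

# Counter tallies how often each item appears in a single list
counts = Counter(df.loc["James", "Fruits"])
print(counts)

# Loop over every row and keep one Counter per index label
per_row = {name: Counter(items) for name, items in df["Fruits"].items()}
for name, tally in per_row.items():
    # A Counter returns 0 for keys it has never seen
    print(name, tally["Apple"], tally["Pear"], tally["Banana"])
```

Because a Counter returns 0 for missing keys, looking up a fruit that never appears in a row does not raise an error.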

Upvotes: 1

anky

Reputation: 75080

You can use df.explode and groupby.value_counts with unstack:

out = (df.join(df['Fruits'].explode().groupby(level=0).value_counts()
         .unstack(fill_value=0)))

print(out)

                     Fruits  Apple  Pear
James  [Apple, Pear, Apple]      2     1
Peter  [Apple, Pear, Apple]      2     1
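To make the chain easier to follow, here is the same pipeline split into steps. This is only a sketch: the sample frame is rebuilt from the question, the intermediate names (exploded, per_person, wide) are mine, and it assumes a pandas version that has Series.explode (0.25 or later).

```python
import pandas as pd

# Sample frame reconstructed from the question
df = pd.DataFrame(
    {"Fruits": [["Apple", "Pear", "Apple"], ["Apple", "Pear", "Apple"]]},
    index=["James", "Peter"],
)

# 1. explode: one row per list element, with the index label repeated
exploded = df["Fruits"].explode()

# 2. group by the original index label and count each value within the group
per_person = exploded.groupby(level=0).value_counts()

# 3. unstack: move the fruit level into columns, using 0 where a fruit is absent
wide = per_person.unstack(fill_value=0)

# 4. join the counts back onto the original frame by index
out = df.join(wide)
print(out)
```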

Upvotes: 2

Dani Mesejo

Reputation: 61910

Use value_counts + concat:

res = pd.concat((df, df['Fruits'].apply(pd.Series.value_counts)), axis=1)
print(res)

Output

                     Fruits  Apple  Pear
James  [Apple, Pear, Apple]      2     1
Peter  [Apple, Pear, Apple]      2     1

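A more general approach, for the case where some lists do not contain every value and the missing counts would otherwise be NaN, is to fill them with 0:

res = pd.concat((df, df['Fruits'].apply(pd.Series.value_counts).fillna(0)), axis=1)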
print(res)
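To see why filling matters, here is a small sketch with hypothetical data in which Peter's list has no Pear. Note two assumptions of mine that are not in the answer above: each list is wrapped in a Series before counting (rather than passing the raw list to pd.Series.value_counts), and the filled result is cast back to int because NaN turns the columns into floats.

```python
import pandas as pd

# Hypothetical data: Peter has no Pear, so his Pear count is missing
df2 = pd.DataFrame(
    {"Fruits": [["Apple", "Pear", "Apple"], ["Apple", "Apple"]]},
    index=["James", "Peter"],
)

# One row of counts per person; absent fruits come out as NaN
counts = df2["Fruits"].apply(lambda items: pd.Series(items).value_counts())

# Replace NaN with 0 and restore integer columns
filled = counts.fillna(0).astype(int)

res2 = pd.concat((df2, filled), axis=1)
print(res2)
```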

Upvotes: 2

jezrael

Reputation: 862581

A solution if performance is important and you need to count all values:

from collections import Counter

df = df.join(pd.DataFrame([Counter(x) for x in df.Fruits.to_numpy()], index=df.index))
print(df)
                     Fruits  Apple  Pear
James  [Apple, Pear, Apple]      2     1
Peter  [Apple, Pear, Apple]      2     1

If you want to count specific values separately:

df['Apple'] = df.Fruits.apply(lambda x: sum(y == 'Apple' for y in x))
df['Pear'] = df.Fruits.apply(lambda x: sum(y == 'Pear' for y in x))
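If there are more than a couple of values to test, the same idea can be written as a loop. This is a sketch only: the targets list (including Banana) is hypothetical, the sample frame is rebuilt from the question, and list.count is swapped in for the generator sum since both give the number of matches.

```python
import pandas as pd

# Sample frame reconstructed from the question
df = pd.DataFrame(
    {"Fruits": [["Apple", "Pear", "Apple"], ["Apple", "Pear", "Apple"]]},
    index=["James", "Peter"],
)

# Hypothetical list of values to count; Banana never occurs in the data
targets = ["Apple", "Pear", "Banana"]

for fruit in targets:
    # list.count returns 0 when the value is absent from the list;
    # f=fruit binds the current value so the lambda does not depend on the loop variable
    df[fruit] = df["Fruits"].apply(lambda items, f=fruit: items.count(f))

print(df)
```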

Upvotes: 2
