Amit

Reputation: 7035

Count occurrences of strings in a Series-type column of a data frame in Python

I have a column in a data frame that looks like this:

[image: the data frame column]

How do I calculate the frequency of each word? For example, the word 'doorman' appears in 4 rows, so I need the word along with its frequency, i.e. doorman = 4. This needs to be done for every word.

Please advise

Upvotes: 1

Views: 197

Answers (1)

jezrael

Reputation: 862771

I think you can first flatten the list of lists in the column and then use Counter:

import pandas as pd

df = pd.DataFrame({'features': [['a', 'b', 'b'], ['c'], ['a', 'a']]})

print(df)
    features
0  [a, b, b]
1        [c]
2     [a, a]

from itertools import chain
from collections import Counter

# Flatten the lists in the column, then count each word
print(Counter(chain.from_iterable(df.features)))
Counter({'a': 3, 'b': 2, 'c': 1})
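
Note that Counter on the flattened lists counts every occurrence, so a word repeated within one row is counted more than once. If you want the number of rows that contain each word (as in the 'doorman' example), you could deduplicate each row first. Below is a minimal sketch of both variants, assuming the same sample df as above and pandas 0.25 or later (needed for Series.explode); the names total_counts and row_counts are only for illustration:

```python
from collections import Counter
from itertools import chain

import pandas as pd

# Same sample data as above (assumed shape: one list of words per row)
df = pd.DataFrame({'features': [['a', 'b', 'b'], ['c'], ['a', 'a']]})

# Variant 1: total occurrences using pandas only.
# explode() puts each list element on its own row, value_counts() tallies them.
total_counts = df['features'].explode().value_counts()

# Variant 2: number of rows containing each word.
# set(words) removes repeats within a row before counting.
row_counts = Counter(chain.from_iterable(set(words) for words in df['features']))

print(total_counts.to_dict())
print(dict(row_counts))
```

With the sample data, 'a' has a total count of 3 but appears in only 2 rows, so pick whichever variant matches what you mean by frequency.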

Upvotes: 3
