Reputation: 57
I have a data frame with a column of lists of strings. I want to count how many times each string occurs in that column.
For example:
       samples  subject  trial_num
0  ['aa','bb']        1          1
1  ['bb','cc']        1          2
I want to get 2 for 'bb' and 1 each for 'aa' and 'cc'.
Upvotes: 1
Views: 84
Reputation: 30940
Use:
df['samples'].explode().value_counts().to_dict()
#{'bb': 2, 'aa': 1, 'cc': 1}
Or without explode:
pd.Series(np.concatenate(df['samples'])).value_counts().to_dict()
#{'bb': 2, 'aa': 1, 'cc': 1}
Solution using only numpy (np.unique returns the keys in sorted order):
dict(zip(*np.unique(np.concatenate(df['samples']), return_counts=True)))
#{'aa': 1, 'bb': 2, 'cc': 1}
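For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample frame from the question and runs the explode approach. It assumes the samples column holds real Python lists (not strings that look like lists) and a pandas version that has Series.explode (0.25 or later). The Counter variant at the end is not one of the solutions above, just a plain-Python alternative added for comparison.

```python
from collections import Counter
from itertools import chain

import pandas as pd

# Sample frame rebuilt from the question; each cell in 'samples' is a real list.
df = pd.DataFrame({
    "samples": [["aa", "bb"], ["bb", "cc"]],
    "subject": [1, 1],
    "trial_num": [1, 2],
})

# explode() gives every list element its own row; value_counts() then tallies them.
counts = df["samples"].explode().value_counts().to_dict()

# Plain-Python alternative (an addition, not from the answer above):
# flatten the lists with chain.from_iterable and count with Counter.
counts_py = dict(Counter(chain.from_iterable(df["samples"])))

print(counts)
print(counts_py)
```

Both dictionaries should hold the same counts. If the column was read from a CSV, the cells may be strings rather than lists and would need parsing first.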
Upvotes: 2