Reputation: 2385
I'm currently working with a data frame that has a column whose cells each contain a list of strings.
I'd like to apply value_counts() to it as if all the lists had been concatenated into a single huge list (I tried to do that, but it didn't work very well).
Toy example of the data structure that I have:
import pandas as pd
df_list = pd.DataFrame({'listcol':[['a','b','c'],['a','b','c']]})
print(df_list)
listcol
0 [a, b, c]
1 [a, b, c]
I would like value_counts() to behave as it would on one big concatenated list, like this:
#desired output:
df=pd.DataFrame(['a','b','c','a','b','c'])
df.columns = ['col']
df.col.value_counts() #desired output!
b 2
c 2
a 2
Thanks in advance!
Upvotes: 4
Views: 1002
Reputation: 862611
I think you first need to create a flattened list, then apply Counter, and finally create a Series:
from itertools import chain
from collections import Counter
print (Counter(chain.from_iterable(df_list['listcol'])))
Counter({'b': 2, 'a': 2, 'c': 2})
s = pd.Series(Counter(chain.from_iterable(df_list['listcol'])))
print (s)
a 2
b 2
c 2
dtype: int64
Or create a Series from the flattened list and use value_counts:
#for python 2 omit list
s = pd.Series(list(chain.from_iterable(df_list['listcol'])))
print (s)
0 a
1 b
2 c
3 a
4 b
5 c
dtype: object
print (s.value_counts())
c 2
a 2
b 2
dtype: int64
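A third option, not part of the approaches above, is to let pandas do the flattening itself. The sketch below assumes pandas 0.25 or newer, where Series.explode() is available; it reuses the toy df_list from the question, and the variable name counts is only illustrative.

```python
import pandas as pd

# Toy data frame from the question: each cell holds a list of strings.
df_list = pd.DataFrame({'listcol': [['a', 'b', 'c'], ['a', 'b', 'c']]})

# explode() puts every list element on its own row (assumes pandas >= 0.25),
# so value_counts() then counts across all lists at once.
counts = df_list['listcol'].explode().value_counts()
print(counts)
```

Each of 'a', 'b' and 'c' should be counted twice here. The order of tied entries in the printed result may differ between pandas versions, so it is safer not to rely on it.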
Upvotes: 6