Reputation: 146
For example, I have a dataframe like this:
data = {'id': [1,1,1,2,2],
'value': ['red','red and blue','yellow','oak','oak wood']
}
df = pd.DataFrame (data, columns = ['id','value'])
I want :
id value count
1 red 2
1 blue 1
1 yellow 1
2 oak 2
2 wood 1
Many thanks!
Upvotes: 2
Views: 56
Reputation: 862406
Solution for pandas 0.25+ with DataFrame.explode
by lists created by Series.str.split
and GroupBy.size
:
df1 = (df.assign(value = df['value'].str.split())
.explode('value')
.groupby(['id','value'], sort=False)
.size()
.reset_index(name='count'))
print (df1)
id value count
0 1 red 2
1 1 and 1
2 1 blue 1
3 1 yellow 1
4 2 oak 2
5 2 wood 1
For lower pandas versions use DataFrame.set_index
with Series.str.split
and expand=True
for DataFrame
, reshape by DataFrame.stack
, create columns from MultiIndex Series
ands use same solution like above:
df1 = (df.set_index('id')['value']
.str.split(expand=True)
.stack()
.reset_index(name='value')
.groupby(['id','value'], sort=False)
.size()
.reset_index(name='count')
)
print (df1)
id value count
0 1 red 2
1 1 and 1
2 1 blue 1
3 1 yellow 1
4 2 oak 2
5 2 wood 1
Upvotes: 4