Reputation: 4301
I have a dataframe as follows:
Age Sex Score
18 M 25
20 F 30
19 M 22
18 M 30
20 F 27
18 M 28
I want to get the following:
Age Sex Score0 Score1 Score2
18 M 25 30 28
20 F 30 27 -1
19 M 22 -1 -1
My steps are:
def func(x):
x['score'] = ['Score' + str(i) for i in range(len(x))]
return x
df['key'] = df['Sex'] + df['Age'].astype(str)
dg = df.groupby(['Age', 'Sex']).apply(func)
dh =dg.pivot(index='key', columns = 'score', values = 'Score').reset_index().fillna(-1)
score key Score0 Score1 Score2
0 F20 30.0 27.0 -1.0
1 M18 25.0 30.0 28.0
2 M19 22.0 -1.0 -1.0
Are there any simpler methods?
Thanks.
Upvotes: 1
Views: 76
Reputation: 31001
Run:
df.groupby(['Age', 'Sex']).Score.apply(
lambda grp: grp.reset_index(drop=True))\
.unstack().fillna(-1, downcast='infer').add_prefix('Score')
Upvotes: 0
Reputation: 29635
you can do it with set_index
and unstack
df.set_index([(df['Sex']+df['Age'].astype(str)).rename('key'),
'Score'+df.groupby(['Sex','Age']).cumcount().astype(str)])['Score']\
.unstack(fill_value=-1).reset_index()
key Score0 Score1 Score2
0 F20 30 27 -1
1 M18 25 30 28
2 M19 22 -1 -1
Upvotes: 1
Reputation: 150785
You can use pivot_table
on two columns with assign
:
(df.assign(col=df.groupby(['Age','Sex']).cumcount().add(1))
.pivot_table(index=['Age','Sex'], columns='col',values='Score', fill_value=-1)
.add_prefix('Score')
.reset_index()
)
Output:
col Age Sex Score1 Score2 Score3
0 18 M 25 30 28
1 19 M 22 -1 -1
2 20 F 30 27 -1
Upvotes: 3