Reputation: 4301

How to simplify the use of pivot in pandas?

I have a dataframe as follows:

Age Sex Score
18  M   25
20  F   30
19  M   22
18  M   30
20  F   27
18  M   28

I want to get the following:

Age Sex Score0 Score1 Score2
18  M   25     30     28
20  F   30     27     -1
19  M   22     -1     -1

My steps are:

def func(x):
    x['score'] = ['Score' + str(i) for i in range(len(x))]
    return x

df['key'] = df['Sex']  + df['Age'].astype(str)
dg = df.groupby(['Age', 'Sex']).apply(func)
dh =dg.pivot(index='key', columns = 'score', values = 'Score').reset_index().fillna(-1)

score  key  Score0  Score1  Score2
0      F20    30.0    27.0    -1.0
1      M18    25.0    30.0    28.0
2      M19    22.0    -1.0    -1.0

Are there any simpler methods?

Thanks.

Upvotes: 1

Answers (3)

Valdi_Bo

Reputation: 31001

Run:

df.groupby(['Age', 'Sex']).Score.apply(
    lambda grp: grp.reset_index(drop=True))\
    .unstack().fillna(-1, downcast='infer').add_prefix('Score')

Upvotes: 0

Ben.T

Reputation: 29635

you can do it with set_index and unstack

df.set_index([(df['Sex']+df['Age'].astype(str)).rename('key'), 
              'Score'+df.groupby(['Sex','Age']).cumcount().astype(str)])['Score']\
  .unstack(fill_value=-1).reset_index()
   key  Score0  Score1  Score2
0  F20      30      27      -1
1  M18      25      30      28
2  M19      22      -1      -1

Upvotes: 1

Quang Hoang

Reputation: 150785

You can use pivot_table on two columns with assign:

(df.assign(col=df.groupby(['Age','Sex']).cumcount().add(1))
   .pivot_table(index=['Age','Sex'], columns='col',values='Score', fill_value=-1)
   .add_prefix('Score')
   .reset_index()
)

Output:

col  Age Sex  Score1  Score2  Score3
0     18   M      25      30      28
1     19   M      22      -1      -1
2     20   F      30      27      -1

Upvotes: 3

How to simplify the use of pivot in pandas?

Answers (3)

Related Questions