Reputation: 2209
I have the following pandas dataframe:
colA colB
idcode
3223 s1 13
3223 s2 18
211 s1 24
211 s2 43
211 s3 198
where colA
and colB
are the columns and idcode
is the index for rows.
I would like to have as output a df as follows:
s1 s2 s3
idcode
3223 13 18 nan(or 0 but I'd prefer nan)
211 24 43 198
How can I do this?
Upvotes: 2
Views: 49
Reputation: 294488
Use set_index
and unstack
df.set_index('colA', append=True).colB.unstack()
colA s1 s2 s3
idcode
211 24.0 43.0 198.0
3223 13.0 18.0 NaN
To get rid of the colA
label on the columns
object.
df.set_index('colA', append=True).colB.unstack().rename_axis(None, 1)
s1 s2 s3
idcode
211 24.0 43.0 198.0
3223 13.0 18.0 NaN
What if idcode
/colA
pairs are not unique?
Then we have to aggregate.
pandas
0.20
groupby
# This could be any thing
# that aggregates, mean, sum, etc.
# /---\
df.groupby(['idcode', 'colA']).colB.first().unstack()
colA s1 s2 s3
idcode
211 24.0 43.0 198.0
3223 13.0 18.0 NaN
pivot_table
df.pivot_table('colB', index='idcode', columns='colA', aggfunc='first')
colA s1 s2 s3
idcode
211 24.0 43.0 198.0
3223 13.0 18.0 NaN
Upvotes: 2
Reputation: 7038
Here's another way using pivot()
:
df.pivot(columns='colA', values='colB')
colA s1 s2 s3
idcode
211 24.0 43.0 198.0
3223 13.0 18.0 NaN
Upvotes: 2