Reputation: 455
I have the below data frame:
df = pd.DataFrame([['A',1,2],['B',1,5],['A',2,5],['C',1,8],['A',1,5]], columns = ['person','status','result'])
person status result
0 A 1 2
1 B 1 5
2 A 2 5
3 C 1 8
4 A 1 5
My aim: I need to make person and status as index and check the result of each index. However, if there is index duplicate, I want to keep it such that person A with status 1 has 2 results 2 and 5
status 1 2
person
A 2.0 5.0
A 5.0
B 5.0 NaN
C 8.0 NaN
My attempt:
df1.set_index(['person','status'])['result'].unstack()
But this does not work as duplicate of index is not allowed.
Upvotes: 1
Views: 101
Reputation: 28709
Bit of a hack, here goes:
df = pd.DataFrame([['A',1,2],['B',1,5],['A',2,5],['C',1,8],['A',1,5]], columns
= ['person','status','result'])
df
person status result
0 A 1 2
4 A 1 5
2 A 2 5
1 B 1 5
3 C 1 8
M=(df
.assign(result= lambda x: x.result.astype('str'))
.groupby(['person','status'],as_index=False)
.agg(lambda x: x.str.cat(sep=','))
.pivot(index='person',columns='status')
)
#https://stackoverflow.com/a/55757002/7175713
M.columns = M.columns.to_flat_index()
(M.reset_index()
.assign(exp = lambda x: x[('result',1)].str.split(','))
.explode('exp')
.drop(('result',1),axis=1)
.rename(columns={('result',2):'2','exp':'1'})
.reset_index(drop=True)
.loc[:,['person','1','2']]
)
person 1 2
0 A 2 5
1 A 5 5
2 B 5 NaN
3 C 8 NaN
Upvotes: 1