Reputation: 109
I have this sample dataframe
ID1,ID2,info,Value
1,50,BA,101
1,50,BA,102
2,100,CA,03
2,100,CA,07
2,100,BA,09
2,100,BA,10
To reshape the data, I am using unstack and this is more like 90% which I want and it works.
grouped = df.groupby(['ID1','ID1','ID2','info'])[['Value']].apply(lambda x: pd.DataFrame(x.values, columns=[['Value']]))\
.unstack().sort_index(level=1, axis=1).reset_index(drop= True, level=1).reset_index()
grouped.columns = ['{}{}'.format(a,b) for a, b in df_grouped.columns]
and this is the result I have
ID1,ID2,info,Value0,Value1
1,50,BA,101,102
2,100,BA,9,10
2,100,CA,3,7
probably it would be more clearer if I can put all the ID1 in one row and if I have two different info value corresponding to the ID1, then I would like to achieve something similar like this
ID1,ID2,info,Value0,Value1,Value2,Value3
1,50,BA,101,102,,
2,100,[BA,CA],9,10,3,7
Any suggestions ?
Upvotes: 0
Views: 38
Reputation: 863176
Your solution should be changed by aggregate list
first, then if necessary only unique values of list convert values to sets and back to lists and from another columns create DataFrame which isjoin
ed to back:
grouped = df.groupby(['ID1','ID1','ID2']).agg(list)
grouped['info'] = grouped['info'].apply(lambda x: list(set(x)))
df1 = pd.DataFrame(grouped.pop('Value').values.tolist(),
index=grouped.index).add_prefix('Value')
grouped = grouped.join(df1)
print (grouped)
info Value0 Value1 Value2 Value3
ID1 ID1 ID2
1 1 50 [BA] 101 102 NaN NaN
2 2 100 [BA, CA] 3 7 9.0 10.0
Upvotes: 1