Reputation: 53
My original DataFrame looks like this:
A B C D
2 10 39 109
1 8 40 111
3 9 38 108
2 11 41 107
3 13 40 112
2 12 39 113
The output I want merges the rows that share the same value in column A into a single row:
A B C D B1 C1 D1 B2 C2 D2
2 10 39 109 11 41 107 12 39 113
1 8 40 111 NA NA NA NA NA NA
3 9 38 108 13 40 112 NA NA NA
Upvotes: 3
Views: 266
Reputation: 863166
Use GroupBy.cumcount to number the rows within each group of A, then reshape with DataFrame.unstack:
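# Number each row within its group of A: 0 for the first occurrence, 1 for the second, ...
g = df.groupby('A').cumcount()
# Move the counter into the columns, then order the columns by counter
df1 = df.set_index(['A', g]).unstack().sort_index(level=1, axis=1)
# Flatten the MultiIndex columns: ('B', 0) -> 'B', ('B', 1) -> 'B1'
df1.columns = [f'{a}{b}' if b != 0 else a for a, b in df1.columns]
df1 = df1.reset_index()
print(df1)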
A B C D B1 C1 D1 B2 C2 D2
0 1 8.0 40.0 111.0 NaN NaN NaN NaN NaN NaN
1 2 10.0 39.0 109.0 11.0 41.0 107.0 12.0 39.0 113.0
2 3 9.0 38.0 108.0 13.0 40.0 112.0 NaN NaN NaN
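To make the steps above easier to follow, here is a self-contained sketch that rebuilds the sample frame from the question and prints the intermediate counter. The `reshape_wide` helper name is hypothetical; it only wraps the same calls used above.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'A': [2, 1, 3, 2, 3, 2],
    'B': [10, 8, 9, 11, 13, 12],
    'C': [39, 40, 38, 41, 40, 39],
    'D': [109, 111, 108, 107, 112, 113],
})

def reshape_wide(frame, key='A'):
    """Hypothetical helper: spread repeated `key` rows across numbered columns."""
    # Position of each row within its group of `key`
    g = frame.groupby(key).cumcount()
    wide = frame.set_index([key, g]).unstack().sort_index(level=1, axis=1)
    # ('B', 0) -> 'B', ('B', 1) -> 'B1', ...
    wide.columns = [f'{a}{b}' if b != 0 else a for a, b in wide.columns]
    return wide.reset_index()

counter = df.groupby('A').cumcount()
print(counter.tolist())  # [0, 0, 0, 1, 1, 2]

df1 = reshape_wide(df)
print(df1.columns.tolist())
```

The counter is what makes the reshape possible: the pair (A, counter) is unique for every row, so unstack can place each row's values in its own set of columns.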
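To keep integers and show 0 in place of the missing values, convert the columns to categoricals first, then add 0 as a category before filling:
# Requires: import pandas as pd
df = df.apply(pd.Categorical)
g = df.groupby('A').cumcount()
df1 = df.set_index(['A', g]).unstack().sort_index(level=1, axis=1)
# fillna(0) only works on a categorical if 0 is already one of its categories
df1 = df1.apply(lambda x: x.cat.add_categories([0])).fillna(0)
df1.columns = [f'{a}{b}' if b != 0 else a for a, b in df1.columns]
df1 = df1.reset_index()
print(df1)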
A B C D B1 C1 D1 B2 C2 D2
0 1 8 40 111 0 0 0 0 0 0
1 2 10 39 109 11 41 107 12 39 113
2 3 9 38 108 13 40 112 0 0 0
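As an alternative sketch (not part of the answer above), pandas' nullable integer dtype `Int64` can keep whole numbers while leaving the gaps as missing values rather than 0. This swaps in a different technique from the categorical approach and assumes a pandas version that supports the `Int64` extension dtype.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'A': [2, 1, 3, 2, 3, 2],
    'B': [10, 8, 9, 11, 13, 12],
    'C': [39, 40, 38, 41, 40, 39],
    'D': [109, 111, 108, 107, 112, 113],
})

g = df.groupby('A').cumcount()
df1 = df.set_index(['A', g]).unstack().sort_index(level=1, axis=1)
df1.columns = [f'{a}{b}' if b != 0 else a for a, b in df1.columns]

# The reshaped columns are float64 because of NaN padding;
# casting to the nullable Int64 dtype turns NaN into <NA> and keeps integers.
df1 = df1.astype('Int64').reset_index()
print(df1)
```

Whether missing values should stay missing or become 0 depends on what the downstream code expects, so treat this as one option rather than a replacement for the approach above.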
Upvotes: 4