Reputation:
I have the following pandas dataframe.
ID Col1 Col2 Col3 Col4 Col5 Col6 Col7 Col8 Col9
1 A B C A B C A B C
2 D E F D E F D E F
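For anyone who wants to reproduce this, the frame above can be built roughly as follows. This is a minimal sketch: the variable name `df` is only what the answers happen to use, and I am assuming `ID` is a regular column rather than the index.

```python
import pandas as pd

# Sample frame from the question; ID is kept as a regular column (assumption).
df = pd.DataFrame({
    "ID": [1, 2],
    "Col1": ["A", "D"], "Col2": ["B", "E"], "Col3": ["C", "F"],
    "Col4": ["A", "D"], "Col5": ["B", "E"], "Col6": ["C", "F"],
    "Col7": ["A", "D"], "Col8": ["B", "E"], "Col9": ["C", "F"],
})
print(df)
```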
I would like to get the following table by stacking each group of three columns into its own row.
ID Col_1 Col_2 Col_3
1 A B C
A B C
A B C
2 D E F
D E F
D E F
OR
ID Col_1 Col_2 Col_3
1 A B C
1 A B C
1 A B C
2 D E F
2 D E F
2 D E F
Is there an efficient way to do this? I looked for similar examples on Stack Overflow but couldn't find any. If you know of one, please point me to it.
Any help is appreciated!
Upvotes: 2
Views: 99
Reputation: 75100
Here is one with apply and np.reshape:
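# Reshape each ID's values into rows of three; assumes ID is a regular column of df
f = lambda x: pd.DataFrame(np.reshape(x.to_numpy(), (-1, 3))).add_prefix('Col_')
df.set_index('ID').groupby(level='ID').apply(f).droplevel(1).reset_index()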
ID Col_0 Col_1 Col_2
0 1 A B C
1 1 A B C
2 1 A B C
3 2 D E F
4 2 D E F
5 2 D E F
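Since every row holds the same number of values, the reshape can probably also be done once for the whole frame instead of once per group. The following is only a sketch under a few assumptions: `ID` is a regular column, the number of value columns is a multiple of `k`, and the names `k`, `vals`, `groups_per_row` and `out` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample frame from the question, with ID as a regular column (assumption).
df = pd.DataFrame(
    [[1, "A", "B", "C", "A", "B", "C", "A", "B", "C"],
     [2, "D", "E", "F", "D", "E", "F", "D", "E", "F"]],
    columns=["ID"] + [f"Col{i}" for i in range(1, 10)],
)

k = 3  # values per output row
vals = df.drop(columns="ID").to_numpy()
groups_per_row = vals.shape[1] // k  # 3 for the sample frame

# Row-major reshape: each run of k consecutive values becomes one output row.
out = pd.DataFrame(vals.reshape(-1, k), columns=[f"Col_{i + 1}" for i in range(k)])
# Each input row becomes `groups_per_row` output rows, so repeat its ID that often.
out.insert(0, "ID", np.repeat(df["ID"].to_numpy(), groups_per_row))
print(out)
```

This avoids calling a Python function per group, which may matter on frames with many distinct IDs.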
Upvotes: 1
Reputation: 319
I'm not sure about efficiency, but it works:
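k = 3  # width of each column group
vals = df.drop(columns='ID').to_numpy()  # value columns only, assuming ID is a regular column
# columns n, n+k, n+2k, ... hold the n-th value of every group
pd.DataFrame([vals[:, n::k].flatten() for n in range(k)]).T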
0 1 2
0 A B C
1 A B C
2 A B C
3 D E F
4 D E F
5 D E F
Upvotes: 1
Reputation: 863166
You can use DataFrame.stack with a MultiIndex on all columns except ID, created by integer and modulo division:
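df = df.set_index('ID')
c = np.arange(len(df.columns))
# level 0: which block of three a column belongs to; level 1: its position in the block
df.columns = [c // 3, c % 3]
# stack the block level so each block of three becomes its own row
df1 = (df.stack(0)
         .reset_index(level=1, drop=True)
         .rename(columns=lambda x: f'Col_{x+1}')
         .reset_index())
print (df1)
ID Col_1 Col_2 Col_3
0 1 A B C
1 1 A B C
2 1 A B C
3 2 D E F
4 2 D E F
5 2 D E F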
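To make the integer and modulo division step concrete, here is a small sketch of the two label arrays for nine value columns (the count 9 is taken from the sample frame in the question):

```python
import numpy as np

c = np.arange(9)   # positions of the nine value columns
print(c // 3)      # [0 0 0 1 1 1 2 2 2]  -> which block of three
print(c % 3)       # [0 1 2 0 1 2 0 1 2]  -> position inside the block
```

Passing both arrays as the columns gives a two-level MultiIndex, so stack can move one of the levels into the rows.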
Upvotes: 5