I am looking to do a particular operation on a pandas DataFrame using Python 3. I want to collapse an NxK DataFrame into an NKx3 DataFrame which consists of three columns: the entry, the column label, and the index label from the original DataFrame. Here is an example:
     'a'  'b'  'c'
'e'    1    2    3
'f'    4    5    6
Desired output:
   0    1    2
0  1  'a'  'e'
1  4  'a'  'f'
2  2  'b'  'e'
3  5  'b'  'f'
4  3  'c'  'e'
5  6  'c'  'f'
I am looking for an elegant, Pythonic way to achieve this, but as I am dealing with very large DataFrames, the highest priority is efficiency.
Reputation: 294508
pandas
Use unstack + reset_index:
df.unstack().reset_index()
  level_0 level_1  0
0       a       e  1
1       a       f  4
2       b       e  2
3       b       f  5
4       c       e  3
5       c       f  6
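For a self-contained check, here is a minimal sketch that rebuilds the example frame and applies the same call. The constructor call for df is an assumption, since the question only shows the printed table.

```python
import pandas as pd

# Example frame from the question; this constructor call is an assumption.
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    index=['e', 'f'],
    columns=['a', 'b', 'c'],
)

# unstack() moves the column labels into the outer level of a MultiIndex,
# giving a Series ordered column by column; reset_index() then turns both
# index levels into ordinary columns.
long_df = df.unstack().reset_index()
print(long_df)
```

The unnamed index levels come out as level_0 (original column label) and level_1 (original index label), and the values land in a column named 0.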
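To replicate exactly what you have (axis is passed by keyword, since recent pandas versions no longer accept it positionally in sort_index):
df.unstack().rename_axis([1, 2]).reset_index().sort_index(axis=1)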
   0  1  2
0  1  a  e
1  4  a  f
2  2  b  e
3  5  b  f
4  3  c  e
5  6  c  f
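If descriptive column names are preferable to 0, 1, 2, the same chain can name them directly. This is a sketch only: the names entry, column and index are hypothetical labels chosen for illustration, and the construction of df is assumed from the question's table.

```python
import pandas as pd

# Example frame from the question; this constructor call is an assumption.
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    index=['e', 'f'],
    columns=['a', 'b', 'c'],
)

named = (
    df.unstack()
      .rename_axis(['column', 'index'])   # name the two MultiIndex levels
      .reset_index(name='entry')          # name the value column
      [['entry', 'column', 'index']]      # put the entry first, as in the question
)
print(named)
```

Selecting the columns in the wanted order replaces the sort_index step, which only works when the column labels happen to sort into the desired order.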
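numpy
import numpy as np
import pandas as pd

v = df.values
pd.DataFrame({
    0: v.ravel('F'),                          # entries in column-major order
    1: df.columns.values.repeat(v.shape[0]),  # each column label repeated N times
    2: np.tile(df.index.values, v.shape[1])   # index labels tiled K times
})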
   0  1  2
0  1  a  e
1  4  a  f
2  2  b  e
3  5  b  f
4  3  c  e
5  6  c  f
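To see that the numpy construction lines up row for row with the unstack version, the two can be compared on the example frame. This is a rough sketch: the construction of df and the variable names via_numpy, via_pandas and same are assumptions for illustration, and it checks correctness only, not speed.

```python
import numpy as np
import pandas as pd

# Example frame from the question; this constructor call is an assumption.
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    index=['e', 'f'],
    columns=['a', 'b', 'c'],
)

v = df.to_numpy()
n_rows, n_cols = v.shape

via_numpy = pd.DataFrame({
    0: v.ravel('F'),                              # entries, column by column
    1: np.repeat(df.columns.to_numpy(), n_rows),  # a, a, b, b, c, c
    2: np.tile(df.index.to_numpy(), n_cols),      # e, f, e, f, e, f
})

via_pandas = df.unstack().rename_axis([1, 2]).reset_index().sort_index(axis=1)

# Compare cell by cell as plain Python lists, ignoring dtype differences.
same = via_numpy.to_numpy().tolist() == via_pandas.to_numpy().tolist()
print(same)
```

Since efficiency is the stated priority, it is worth timing both variants on a frame of realistic size before choosing one.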