Reputation: 33
I have the following data frame:
     parent         0         1  2  3
0  14026529  14062504         0  0  0
1  14103793  14036094         0  0  0
2  14025454  14036094         0  0  0
3  14030252  14030253  14062647  0  0
4  14034704  14086964         0  0  0
And I need this:
   parent_id  child_id
0   14026529  14062504
1   14025454  14036094
2   14030252  14030253
3   14030252  14062647
4   14103793  14036094
5   14034704  14086964
This is just a basic example; the real data can have over 60 child columns per parent.
Upvotes: 3
Views: 105
Reputation: 18647
Use DataFrame.where, stack and reset_index.
Casting to Int64 (pandas' nullable integer dtype) first prevents the child IDs from being cast to floats once the zeros are masked out as missing values.
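To see why the cast matters, here is a minimal sketch on a toy frame. The values are hypothetical and not from the question; 0 stands in for "no child", as in the original data.

```python
import pandas as pd

# Toy frame with made-up values; 0 marks an empty child slot.
df = pd.DataFrame({"parent": [1, 2], 0: [10, 20], 1: [11, 0]})

# Plain int64: masking inserts NaN, so the affected column becomes float64.
masked_plain = df.where(df.ne(0))

# Nullable Int64: masking inserts <NA> and the integer dtype is kept.
masked_nullable = df.astype("Int64").where(df.ne(0))

print(masked_plain.dtypes)
print(masked_nullable.dtypes)
```

Comparing the two dtype listings shows the difference for the column that contained a zero.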
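# df is the data frame from the question
(df.astype('Int64')        # nullable integers, so masked cells stay integer
 .where(df.ne(0))          # replace the 0 placeholders with <NA>
 .set_index('parent')
 .stack()                  # long format: one row per (parent, child) pair
 .reset_index(level=0, name='child'))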
[out]
     parent     child
0  14026529  14062504
0  14103793  14036094
0  14025454  14036094
0  14030252  14030253
1  14030252  14062647
0  14034704  14086964
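For anyone who wants to run this end to end, below is a self-contained sketch that rebuilds the question's frame and applies the same chain. The explicit dropna(), the rename to parent_id / child_id, and the final reset_index(drop=True) are additions of mine, not part of the answer above: dropna() is there as a precaution in case the installed pandas version keeps <NA> rows when stacking, and the other two only match the column names and the 0..n index requested in the question.

```python
import pandas as pd

# Data copied from the question; 0 marks an unused child slot.
df = pd.DataFrame({
    "parent": [14026529, 14103793, 14025454, 14030252, 14034704],
    0: [14062504, 14036094, 14036094, 14030253, 14086964],
    1: [0, 0, 0, 14062647, 0],
    2: [0, 0, 0, 0, 0],
    3: [0, 0, 0, 0, 0],
})

pairs = (
    df.astype("Int64")                 # nullable integer dtype
      .where(df.ne(0))                 # 0 -> <NA>
      .set_index("parent")
      .stack()                         # one row per (parent, child slot)
      .dropna()                        # assumption: guard for versions that keep <NA>
      .reset_index(level=0, name="child_id")
      .rename(columns={"parent": "parent_id"})
      .reset_index(drop=True)          # replace the leftover slot labels with 0..n
)
print(pairs)
```

The rows come out in the original parent order rather than the order shown in the question, but the set of parent/child pairs is the same. Because the chain never names individual child columns, it should apply unchanged to frames with 60 or more of them.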
Upvotes: 2