Reputation: 147
I have this dataframe:
PNN_sh NN_shap PNN_corr NN_corr
1 25005 1 25005
2 25012 2 25001
3 25011 3 25009
4 25397 4 25445
5 25006 5 25205
Then I made two dataframes from this one:
NN_sh = data[['PNN_sh', 'NN_shap']]
NN_corr = data[['PNN_corr', 'NN_corr']]
Then I sorted each one and saved the results in new dataframes:
NN_sh_sort = NN_sh.sort_values(by=['NN_shap'])
NN_corr_sort = NN_corr.sort_values(by=['NN_corr'])
Now I want to combine one column from each of the two sorted dataframes:
all_pd = pd.concat([NN_sh_sort['PNN_sh'], NN_corr_sort['PNN_corr']], axis=1, join='inner')
But the result just has the first column copied into the second one:
PNN_sh PNN_corr
1 1
5 5
3 3
2 2
4 4
The second column should be
PNN_corr
2
1
3
5
4
Any idea how to fix it? Thanks in advance
Upvotes: 1
Views: 48
Reputation: 170
I think the problem is that sort_values keeps each row's original index label, and pd.concat with axis=1 aligns rows on those labels, not on their position. So each PNN_sh value gets joined with the PNN_corr value that was originally in the same row (at the same index). Try resetting the index of each DataFrame after sorting, then join/concat:
NN_sh_sort = NN_sh.sort_values(by=['NN_shap']).reset_index(drop=True)
NN_corr_sort = NN_corr.sort_values(by=['NN_corr']).reset_index(drop=True)
all_pd = pd.concat([NN_sh_sort['PNN_sh'], NN_corr_sort['PNN_corr']], axis=1, join='inner')
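To see why the index matters, here is a minimal sketch that rebuilds the question's data (the dict literal is my reconstruction of the table in the question) and prints the index labels left behind by the sort. The variable names sh_labels, corr_labels, aligned and positional are just for illustration.

```python
import pandas as pd

# Reconstruction of the dataframe shown in the question.
data = pd.DataFrame({
    "PNN_sh": [1, 2, 3, 4, 5],
    "NN_shap": [25005, 25012, 25011, 25397, 25006],
    "PNN_corr": [1, 2, 3, 4, 5],
    "NN_corr": [25005, 25001, 25009, 25445, 25205],
})

NN_sh_sort = data[["PNN_sh", "NN_shap"]].sort_values(by="NN_shap")
NN_corr_sort = data[["PNN_corr", "NN_corr"]].sort_values(by="NN_corr")

# The rows moved, but each row still carries its original index label.
sh_labels = NN_sh_sort.index.tolist()
corr_labels = NN_corr_sort.index.tolist()
print(sh_labels)    # [0, 4, 2, 1, 3]
print(corr_labels)  # [1, 0, 2, 4, 3]

# concat(axis=1) matches rows by label, so the original pairing comes back.
aligned = pd.concat(
    [NN_sh_sort["PNN_sh"], NN_corr_sort["PNN_corr"]], axis=1, join="inner"
)

# Dropping the old labels makes concat pair rows by sorted position instead.
positional = pd.concat(
    [
        NN_sh_sort["PNN_sh"].reset_index(drop=True),
        NN_corr_sort["PNN_corr"].reset_index(drop=True),
    ],
    axis=1,
)
print(positional)
```

Passing drop=True keeps the old labels from being added back as an extra "index" column, which is harmless here but usually not wanted.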
Upvotes: 0
Reputation: 195438
Pass ignore_index=True to sort_values():
NN_sh_sort = NN_sh.sort_values(by=['NN_shap'], ignore_index=True)
NN_corr_sort = NN_corr.sort_values(by=['NN_corr'], ignore_index=True)
Then the result after concat will be:
PNN_sh PNN_corr
0 1 2
1 5 1
2 3 3
3 2 5
4 4 4
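For completeness, a self-contained sketch of the whole thing, assuming pandas 1.0 or newer (that is when sort_values gained ignore_index). The data dict is reconstructed from the question, and by_position is a hypothetical alternative that skips index alignment entirely by going through NumPy arrays.

```python
import pandas as pd

# Reconstruction of the dataframe shown in the question.
data = pd.DataFrame({
    "PNN_sh": [1, 2, 3, 4, 5],
    "NN_shap": [25005, 25012, 25011, 25397, 25006],
    "PNN_corr": [1, 2, 3, 4, 5],
    "NN_corr": [25005, 25001, 25009, 25445, 25205],
})

# ignore_index=True relabels the sorted rows 0..n-1.
NN_sh_sort = data[["PNN_sh", "NN_shap"]].sort_values(by="NN_shap", ignore_index=True)
NN_corr_sort = data[["PNN_corr", "NN_corr"]].sort_values(by="NN_corr", ignore_index=True)

# Both frames now share the same 0..4 index, so rows pair up by sorted position.
all_pd = pd.concat([NN_sh_sort["PNN_sh"], NN_corr_sort["PNN_corr"]], axis=1)
print(all_pd)

# Alternative sketch: plain arrays carry no index, so nothing can be realigned.
by_position = pd.DataFrame({
    "PNN_sh": data.sort_values(by="NN_shap")["PNN_sh"].to_numpy(),
    "PNN_corr": data.sort_values(by="NN_corr")["PNN_corr"].to_numpy(),
})
```

Both all_pd and by_position should match the table above. The array version only makes sense when the two columns have the same length, since there is no index left to line them up.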
Upvotes: 1