Halan

Reputation: 147

Pandas concat returns the same data as the first DataFrame


I have this DataFrame:

PNN_sh  NN_shap  PNN_corr  NN_corr
1       25005    1         25005
2       25012    2         25001
3       25011    3         25009
4       25397    4         25445
5       25006    5         25205

Then I made two DataFrames from it:

NN_sh = data[['PNN_sh', 'NN_shap']]
NN_corr = data[['PNN_corr', 'NN_corr']]

Next, I sorted each one and saved the results in new DataFrames:

NN_sh_sort = NN_sh.sort_values(by=['NN_shap'])
NN_corr_sort = NN_corr.sort_values(by=['NN_corr'])

Now I want to combine one column from each of the two sorted DataFrames:

all_pd = pd.concat([NN_sh_sort['PNN_sh'], NN_corr_sort['PNN_corr']], axis=1, join='inner')

But the result just has the first column copied into the second one:

PNN_sh  PNN_corr
   1    1
   5    5
   3    3
   2    2
   4    4

The second column should be:

PNN_corr
  2
  1
  3
  5
  4

Any idea how to fix it? Thanks in advance

Upvotes: 1

Views: 48

Answers (2)

Mike Tomaino

Reputation: 170

I think sorting preserves the original index labels of each DataFrame. Because pd.concat with axis=1 aligns rows by index label rather than by position, each PNN_sh value is joined with the PNN_corr value that was originally in the same row (at the same index). Try resetting the index of each DataFrame after sorting, then concat.

# drop=True discards the old index instead of adding it back as an 'index' column
NN_sh_sort = NN_sh.sort_values(by=['NN_shap']).reset_index(drop=True)
NN_corr_sort = NN_corr.sort_values(by=['NN_corr']).reset_index(drop=True)
all_pd = pd.concat([NN_sh_sort['PNN_sh'], NN_corr_sort['PNN_corr']], axis=1, join='inner')
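
To see the alignment behaviour described above, here is a minimal, self-contained sketch that rebuilds the question's data, inspects the index after sorting, and compares the label-aligned result with the positional one. The names sh_index, corr_index, aligned and fixed are only illustrative and do not come from the original post.

```python
import pandas as pd

# Sample data copied from the question.
data = pd.DataFrame({
    "PNN_sh": [1, 2, 3, 4, 5],
    "NN_shap": [25005, 25012, 25011, 25397, 25006],
    "PNN_corr": [1, 2, 3, 4, 5],
    "NN_corr": [25005, 25001, 25009, 25445, 25205],
})

NN_sh_sort = data[["PNN_sh", "NN_shap"]].sort_values(by="NN_shap")
NN_corr_sort = data[["PNN_corr", "NN_corr"]].sort_values(by="NN_corr")

# Sorting reorders the rows but keeps each row's original index label.
sh_index = NN_sh_sort.index.tolist()
corr_index = NN_corr_sort.index.tolist()

# concat(axis=1) matches rows by label, so the sort order of the second
# column is lost and every row pairs values from the same original row.
aligned = pd.concat(
    [NN_sh_sort["PNN_sh"], NN_corr_sort["PNN_corr"]], axis=1, join="inner"
)

# Resetting both indexes to 0..n-1 makes concat pair rows by position.
fixed = pd.concat(
    [
        NN_sh_sort["PNN_sh"].reset_index(drop=True),
        NN_corr_sort["PNN_corr"].reset_index(drop=True),
    ],
    axis=1,
)

print(sh_index, corr_index)
print(aligned)
print(fixed)
```

The two index lists differ after sorting, which is why the label-based join undoes the second sort.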

Upvotes: 0

Andrej Kesely

Reputation: 195438

Pass ignore_index=True to sort_values() (available since pandas 1.0) so each sorted DataFrame gets a fresh 0..n-1 index:

NN_sh_sort = NN_sh.sort_values(by=['NN_shap'], ignore_index=True)
NN_corr_sort = NN_corr.sort_values(by=['NN_corr'], ignore_index=True)

Then the result after concat will be:

   PNN_sh  PNN_corr
0       1         2
1       5         1
2       3         3
3       2         5
4       4         4
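
For completeness, a runnable end-to-end sketch of this approach, assuming pandas 1.0 or later and the sample data from the question:

```python
import pandas as pd

# Sample data copied from the question.
data = pd.DataFrame({
    "PNN_sh": [1, 2, 3, 4, 5],
    "NN_shap": [25005, 25012, 25011, 25397, 25006],
    "PNN_corr": [1, 2, 3, 4, 5],
    "NN_corr": [25005, 25001, 25009, 25445, 25205],
})

# ignore_index=True relabels the sorted rows 0..n-1 in their new order.
NN_sh_sort = data[["PNN_sh", "NN_shap"]].sort_values(
    by="NN_shap", ignore_index=True
)
NN_corr_sort = data[["PNN_corr", "NN_corr"]].sort_values(
    by="NN_corr", ignore_index=True
)

# Both frames now share the same index, so rows are paired by sorted position.
all_pd = pd.concat([NN_sh_sort["PNN_sh"], NN_corr_sort["PNN_corr"]], axis=1)
print(all_pd)
```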

Upvotes: 1
