Reputation: 421
Consider two dataframes called "socio_demo" ([198 rows x 15 columns]) and "UPDRS" ([198 rows x 70 columns]). Let's sort both:
socio_demo_sorted = socio_demo.sort_values(['NUMERO_CENTRE_1','NUMERO_INCLUSION_1'])
UPDRS_sorted = UPDRS.sort_values(['NUMERO_CENTRE_2','NUMERO_INCLUSION_2'])
UPDRS_sorted['NUMERO_CENTRE_2'] gives
Out[22]:
3 1
9 1
13 1
18 1
24 1
..
6 6
16 6
20 6
25 6
34 6
Name: NUMERO_CENTRE_2, Length: 198, dtype: int64
Now let's concatenate the two sorted datasets:
frames = [socio_demo_sorted,UPDRS_sorted]
full_data = pd.concat(frames,axis = 1)
which gives the expected [198 rows x 85 columns] shape. However, doing
full_data['NUMERO_CENTRE_2']
returns the original (non-sorted) UPDRS data:
0 3
1 4
2 2
3 1
4 5
..
193 1
194 1
195 1
196 1
197 1
Name: NUMERO_CENTRE_2, Length: 198, dtype: int64
I don't understand why the effect of the ".sort_values" function is lost here.
Upvotes: 2
Views: 715
Reputation: 23217
Sorting reorders the rows, but each row keeps its original index label, so the sorted dataframes carry shuffled copies of the original indexes. pd.concat with axis=1 matches rows by index label, not by position, so every row of the result pairs the rows that had the same original label. The sorted order is therefore not preserved, and in your case the result came back in the original, unsorted order.
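To make the label alignment concrete, here is a minimal sketch on two hypothetical toy frames (the names left, right, a and b are placeholders, not from the question). It only checks which values end up paired, since the row order of the result can differ between pandas versions.

```python
import pandas as pd

# Hypothetical toy frames standing in for socio_demo and UPDRS.
left = pd.DataFrame({"a": [3, 1, 2]})
right = pd.DataFrame({"b": [10, 30, 20]})

left_sorted = left.sort_values("a")    # row labels are now 1, 2, 0
right_sorted = right.sort_values("b")  # row labels are now 0, 2, 1

# axis=1 matches rows by label, so label 0 still pairs a=3 with b=10,
# exactly as in the unsorted frames.
combined = pd.concat([left_sorted, right_sorted], axis=1)
print(combined.loc[0].tolist())  # [3, 10]
```

The first sorted row of left (a=1) is not paired with the first sorted row of right (b=10), which is the effect described in the question.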
You can solve this either by resetting the index of the sorted dataframes with .reset_index(drop=True), or directly by passing ignore_index=True to sort_values (available since pandas 1.0).
Use either:
socio_demo_sorted = socio_demo.sort_values(['NUMERO_CENTRE_1','NUMERO_INCLUSION_1']).reset_index(drop=True)
UPDRS_sorted = UPDRS.sort_values(['NUMERO_CENTRE_2','NUMERO_INCLUSION_2']).reset_index(drop=True)
or by:
socio_demo_sorted = socio_demo.sort_values(['NUMERO_CENTRE_1','NUMERO_INCLUSION_1'], ignore_index=True)
UPDRS_sorted = UPDRS.sort_values(['NUMERO_CENTRE_2','NUMERO_INCLUSION_2'], ignore_index=True)
Then concatenate as in your code:
frames = [socio_demo_sorted,UPDRS_sorted]
full_data = pd.concat(frames,axis = 1)
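As a quick check of this fix, here is a minimal sketch on hypothetical toy frames (centre_1 and centre_2 are placeholder column names). Once both sorted frames share a fresh 0..n-1 index, the concatenation keeps the sorted order.

```python
import pandas as pd

# Hypothetical toy frames; the column names are placeholders.
left = pd.DataFrame({"centre_1": [3, 1, 2]})
right = pd.DataFrame({"centre_2": [2, 3, 1]})

# ignore_index=True relabels the sorted rows 0..n-1 (needs pandas >= 1.0).
left_sorted = left.sort_values("centre_1", ignore_index=True)
right_sorted = right.sort_values("centre_2", ignore_index=True)

# Same result via reset_index, which also works on older pandas.
right_alt = right.sort_values("centre_2").reset_index(drop=True)

full = pd.concat([left_sorted, right_sorted], axis=1)
print(full["centre_2"].tolist())  # [1, 2, 3]
```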
Upvotes: 2
Reputation: 323266
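In your case you can also drop the old row labels at concat time. Note that ignore_index=True in pd.concat renumbers the labels along the concatenation axis (the columns when axis=1) and does not change how rows are matched, so reset the index of each frame instead:
out = pd.concat([df.reset_index(drop=True) for df in frames], axis=1)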
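For comparison, a small sketch of what ignore_index does in pd.concat with axis=1, using hypothetical toy frames (left, right, a and b are placeholders). It renumbers the column labels, while rows are still matched by index label. Resetting each frame's index is what gives positional pairing.

```python
import pandas as pd

# Hypothetical toy frames whose row labels are in different orders.
left = pd.DataFrame({"a": [1, 2]}, index=[1, 0])
right = pd.DataFrame({"b": [10, 20]}, index=[0, 1])

# ignore_index=True with axis=1 replaces the column names with 0, 1.
relabelled = pd.concat([left, right], axis=1, ignore_index=True)
print(relabelled.columns.tolist())  # [0, 1]
# Rows are still matched by label: label 0 pairs a=2 with b=10.
print(relabelled.loc[0].tolist())   # [2, 10]

# Dropping the row labels first pairs the rows by position instead.
positional = pd.concat(
    [df.reset_index(drop=True) for df in (left, right)], axis=1
)
print(positional.iloc[0].tolist())  # [1, 10]
```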
Upvotes: 1