Sos

Reputation: 1949

Sorting multiple columns based on 2 columns in pandas dataframe

I have a dataframe with several columns of alphabetical values that I want to sort. For instance:

ii     A.1     A.2      B.1     B.2
1      Xy      foo      Ly      bar
2      Ab      bar      Ko      foo

I'd like to sort the values of A.1 and B.1 within each row, and reorder A.2 and B.2 to follow the same order. This would become:

ii     s1      s2       b1      b2
1      Ly      bar      Xy      foo
2      Ab      bar      Ko      foo

I am trying to use df.apply(lambda x: x.sort_values()). However, I am having problems changing the order of the additional columns (A.2 and B.2). How would you do this?

Edit: to clarify, I need to reorder A.2 and B.2 according to the order given by sorting A.1 and B.1. For instance:

ii     A.1     A.2      B.1     B.2
1      Xy      mat      Ly      bar
2      Ab      zul      Ko      foo #shouldn't change

becomes:

ii     A.1     A.2      B.1     B.2
1      Ly      bar      Xy      mat
2      Ab      zul      Ko      foo #notice, this is unchanged because A.1 B.1 are already sorted 

Upvotes: 2

Views: 313

Answers (1)

jezrael

Reputation: 862591

I believe you need numpy.argsort to get the positions that sort each row, then use those indices in arr to pick the values in sorted order and assign them back:

import numpy as np

# Column positions that sort A.1/B.1 within each row
arr = df[['A.1', 'B.1']].values.argsort()
print(arr)
[[1 0]
 [0 1]]

# Reorder both column pairs with the same per-row positions
df[['A.1', 'B.1']] = df[['A.1', 'B.1']].values[np.arange(len(arr))[:,None], arr]
df[['A.2', 'B.2']] = df[['A.2', 'B.2']].values[np.arange(len(arr))[:,None], arr]
print(df)
   ii A.1  A.2 B.1  B.2
0   1  Ly  bar  Xy  foo
1   2  Ab  bar  Ko  foo
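
To make the indexing step easier to follow, here is a minimal NumPy-only sketch using the values from the question's edit. The names keys, vals, order and rows are only illustrative, and np.take_along_axis is shown as an alternative spelling that I'd expect to give the same result; it is not part of the code above.

```python
import numpy as np

# Values of A.1/B.1 (keys) and A.2/B.2 (vals), taken from the question's edit
keys = np.array([["Xy", "Ly"], ["Ab", "Ko"]])
vals = np.array([["mat", "bar"], ["zul", "foo"]])

# Per-row column positions that would sort the keys
order = keys.argsort(axis=1)

# Column vector of row numbers; it broadcasts against `order`, so each
# row is indexed with its own sorted positions
rows = np.arange(len(order))[:, None]

sorted_keys = keys[rows, order]
sorted_vals = vals[rows, order]

# Alternative spelling of the same row-wise selection
sorted_vals_alt = np.take_along_axis(vals, order, axis=1)

print(sorted_keys)
print(sorted_vals)
```

The important part is that the same order array is applied to both arrays, so each value stays paired with its key.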

With the new data from the edit, the same code gives:

print (df)
   ii A.1  A.2 B.1  B.2
0   1  Ly  bar  Xy  mat
1   2  Ab  zul  Ko  foo
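
If it helps, the whole thing can be wrapped up as a self-contained sketch that rebuilds the edited sample data and works on a copy. The helper name sort_pairs_rowwise and its parameters are made up for this example, and it assumes the key and value column lists are given in matching order.

```python
import numpy as np
import pandas as pd


def sort_pairs_rowwise(df, key_cols, value_cols):
    """Sort key_cols within each row and reorder value_cols the same way.

    Hypothetical helper; key_cols[i] is assumed to pair with value_cols[i].
    """
    out = df.copy()
    # Per-row positions that sort the key columns
    order = np.argsort(out[key_cols].to_numpy(), axis=1)
    rows = np.arange(len(order))[:, None]
    # Apply the same positions to the keys and to their paired values
    out[key_cols] = out[key_cols].to_numpy()[rows, order]
    out[value_cols] = out[value_cols].to_numpy()[rows, order]
    return out


# Sample data from the question's edit
df = pd.DataFrame({
    "ii": [1, 2],
    "A.1": ["Xy", "Ab"],
    "A.2": ["mat", "zul"],
    "B.1": ["Ly", "Ko"],
    "B.2": ["bar", "foo"],
})

result = sort_pairs_rowwise(df, ["A.1", "B.1"], ["A.2", "B.2"])
print(result)
```

Working on a copy leaves the original df untouched, which makes it easy to compare before and after.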

Upvotes: 2
