Silvia Sanna

Reputation: 41

Sort one array by columns of another array - Python

I have two arrays, a and b, as follows:

a = array([[19.        ,  0.84722222],
           [49.        ,  0.86111111],
           [54.        ,  0.86666667],
           [42.        ,  0.9       ],
           [ 7.        ,  0.91111111],
           [46.        ,  0.99722222]])

b = array([[46.        ,  0.46944444],
           [49.        ,  0.59722222],
           [19.        ,  0.63611111],
           [42.        ,  0.72777778],
           [54.        ,  0.74722222],
           [ 7.        ,  0.98888889]])

I would like to sort b so that its first column matches the first column of array a. My output should be

b = array([[19.        ,  0.63611111],
           [49.        ,  0.59722222],
           [54.        ,  0.74722222],
           [42.        ,  0.72777778],
           [ 7.        ,  0.98888889],
           [46.        ,  0.46944444]])

Upvotes: 3

Views: 960

Answers (3)

Max Power

Reputation: 8954

I think the conceptually simplest way to approach this is with a merge/join. Piggybacking on your array definitions of a and b:

import pandas as pd

# convert arrays to Pandas DataFrames
df_a = pd.DataFrame(a, columns=['id', 'values_a'])
df_b = pd.DataFrame(b, columns=['id', 'values_b'])

# Merge the values from b into the table from a, keeping a's row order
df_merged = df_a.merge(df_b, how='left', on='id')

# Here are the two columns you want (in the desired order) as a 2D NumPy array via .values
answer = df_merged[['id', 'values_b']].values

I find that using DataFrames for this sort of task makes everything clearer, and debugging much easier whenever I encounter unexpected results.

Upvotes: 0

Mad Physicist

Reputation: 114230

Conceptually, you want the indices that turn column zero of b into column zero of a. Imagine doing argsort on both. This gives you the indices to go from a or b to a sorted state. If you apply the inverse operation to the a index, it tells you how to get from sorted back to a. As it happens, the inverse of a permutation is the argsort of that permutation, so applying argsort to the a index a second time yields the inverse. So I present you the following:

import numpy as np

index = np.argsort(b[:, 0])[np.argsort(np.argsort(a[:, 0]))]
b = b[index, ...]

This is O(n log n) time complexity because of the three sorts. Solutions that perform a linear search for each index, such as a loop over np.where, are O(n^2).
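
To make the three argsort calls easier to follow, here is a minimal sketch that runs them one at a time on the arrays from the question. The intermediate names (sort_a, sort_b, rank_a, b_sorted) are only illustrative and do not appear in the answer above; it assumes both first columns hold the same set of unique keys.

```python
import numpy as np

a = np.array([[19.0, 0.84722222],
              [49.0, 0.86111111],
              [54.0, 0.86666667],
              [42.0, 0.9],
              [7.0, 0.91111111],
              [46.0, 0.99722222]])

b = np.array([[46.0, 0.46944444],
              [49.0, 0.59722222],
              [19.0, 0.63611111],
              [42.0, 0.72777778],
              [54.0, 0.74722222],
              [7.0, 0.98888889]])

# Indices that would sort the key column of each array
sort_a = np.argsort(a[:, 0])
sort_b = np.argsort(b[:, 0])

# argsort of a permutation is its inverse: rank_a[j] is the position
# that a[j, 0] would take in sorted order
rank_a = np.argsort(sort_a)

# Row j of the result is the row of b whose key has the same rank as a[j, 0]
index = sort_b[rank_a]
b_sorted = b[index]

print(index)           # [2 1 4 3 5 0]
print(b_sorted[:, 0])  # key column now matches a[:, 0]
```

Because the keys are matched by rank, the result is only meaningful when a and b contain exactly the same keys with no duplicates.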

Upvotes: 6

BalrogOfMoria

Reputation: 134

I am assuming a and b have the same dimensions and that their first columns contain the same set of elements.

import numpy as np

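def same_order(a, b):
    # new_pos[j] will hold the row of b whose key equals a[j, 0]
    new_pos = np.full(shape=a.shape[0], fill_value=-1)
    for i in range(new_pos.shape[0]):
        # find the row of a that has the same key as row i of b
        new_pos[np.where(a[:, 0] == b[i, 0])[0][0]] = i
    return b[new_pos]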
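
The assumption stated above (same length, same set of keys) can be checked before reordering. The helper below is a hypothetical sketch, not part of the answer; keys_match is a name invented here, and it additionally assumes the keys are unique.

```python
import numpy as np

def keys_match(a, b):
    """Return True if the first columns of a and b hold the same unique keys."""
    keys_a, keys_b = a[:, 0], b[:, 0]
    return bool(
        keys_a.shape == keys_b.shape                  # same number of rows
        and np.unique(keys_a).size == keys_a.size     # no duplicate keys
        and np.array_equal(np.sort(keys_a), np.sort(keys_b))  # same key set
    )
```

Calling keys_match(a, b) first and raising an error when it returns False would turn a silently wrong ordering into an explicit failure.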

Upvotes: -1
