Mathews24

Reputation: 751

Sorting pairs in a list of varying dimension

I have a list with elements of varying size (some even empty) given by:

a1 = [array([[83, 84]]), array([[21, 24], [32, 53],[54, 56]]), array([[21,24],[32, 37],[45, 46]]), [], []]

In this list, values are either in pairs (within arrays), or simply empty. All I want to do is sort all the pairs in descending order based on their difference and retain their location/index in the original list (i.e. a1). For example, my desired output is:

a1_sorted = [[32,53],[32,37],[21,24],[21,24],[54,56],[45,46],[83,84],[],[]]
a1_index = [[1,1],[2,1],[1,0],[2,0],[1,2],[2,2],[0,0],[3,0],[4,0]]

Since empty elements don't have a 2D location, the following alternative, which gives only the first index of each element, is also suitable:

a1_index = [1,2,1,2,1,2,0,3,4]

My initial approach was simply to iterate over the list entries, but handling the empty elements and the varying dimension sizes has slowed this effort down. Any thoughts on optimal solutions?

Upvotes: 1

Views: 122

Answers (2)

blhsing

Reputation: 106543

You can use enumerate to generate indices for the lists and sub-lists, then use a generator expression to pair each item with its indices as a tuple so that they are sorted together, and finally unpack the result into two variables with zip:

a1_sorted, a1_index = zip(*sorted(((t, [i, j])
                      for i, l in enumerate(a1) for j, t in enumerate(list(l) or [[]])),
                      key=lambda t: -abs(t[0][1] - t[0][0]) if len(t[0]) else 0))

a1_sorted would become:

[[32, 53], [32, 37], [21, 24], [21, 24], [54, 56], [83, 84], [45, 46], [], []]

a1_index would become:

[[1, 1], [2, 1], [1, 0], [2, 0], [1, 2], [0, 0], [2, 2], [3, 0], [4, 0]]
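
For readability, the same idea can be written as an explicit loop. This is only a sketch of the approach above, not a replacement for it: the function name sort_pairs_by_difference is made up for illustration, and giving empty elements a key of -1 (so they always sort last, even after zero-difference pairs) is an assumption on my part rather than something the question specifies.

```python
import numpy as np

a1 = [np.array([[83, 84]]),
      np.array([[21, 24], [32, 53], [54, 56]]),
      np.array([[21, 24], [32, 37], [45, 46]]),
      [], []]

def sort_pairs_by_difference(nested):
    """Return (pairs, indices) ordered by descending absolute difference.

    Hypothetical helper: empty elements are kept as [] with index [i, 0]
    and are placed after all real pairs.
    """
    records = []  # each record is (pair, [outer, inner], sort_key)
    for i, block in enumerate(nested):
        rows = [[int(v) for v in row] for row in block]
        if not rows:
            # Empty element: keep a placeholder so its position is retained.
            records.append(([], [i, 0], -1))
            continue
        for j, (first, second) in enumerate(rows):
            records.append(([first, second], [i, j], abs(second - first)))
    # list.sort is stable, also with reverse=True, so ties keep input order.
    records.sort(key=lambda r: r[2], reverse=True)
    pairs = [r[0] for r in records]
    indices = [r[1] for r in records]
    return pairs, indices

a1_sorted, a1_index = sort_pairs_by_difference(a1)
print(a1_sorted)
# [[32, 53], [32, 37], [21, 24], [21, 24], [54, 56], [83, 84], [45, 46], [], []]
print(a1_index)
# [[1, 1], [2, 1], [1, 0], [2, 0], [1, 2], [0, 0], [2, 2], [3, 0], [4, 0]]
```

If only the first index is needed, [idx[0] for idx in a1_index] gives the flat alternative from the question.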

Upvotes: 2

Khalil Al Hooti

Reputation: 4506

You could try this code. However, I could not get it to include the empty lists in the output.

import numpy as np
import pandas as pd

# the data
a1 =  [np.array([[83, 84]]), np.array([[21, 24], [32, 53],[54, 56]]), 
       np.array([[21,24],[32, 37],[45, 46]]), np.array([]), 
       np.array([])]

# create a data frame to store data in
df = pd.DataFrame(columns=['pair', 'index', 'difference']) 

for j, item in enumerate(a1): 
    a = item.ravel() # convert 2d array to 1d array
    for i in range(len(a)//2):
        difference = a[i*2+1] - a[i*2]
        pair = [a[i*2], a[i*2+1]]
        index = [j, np.where(np.all(item==pair,axis=1))[0]]

        df.loc[len(df)] = [pair, index, difference]

df.sort_values(by='difference', ascending=False, inplace=True) # sort based on diff

print(df)

       pair     index difference
2  [32, 53]  [1, [1]]         21
5  [32, 37]  [2, [1]]          5
1  [21, 24]  [1, [0]]          3
4  [21, 24]  [2, [0]]          3
3  [54, 56]  [1, [2]]          2
0  [83, 84]  [0, [0]]          1
6  [45, 46]  [2, [2]]          1

a1_sorted =  df['pair'].tolist()
print(a1_sorted)

[[32, 53], [32, 37], [21, 24], [21, 24], [54, 56], [83, 84], [45, 46]]
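
If the empty entries should also appear in the result, one possible NumPy-only variant is to sort the non-empty pairs with a stable argsort and then append the empty positions at the end. This is a sketch under a few assumptions: at least one element of a1 is non-empty, every non-empty element has shape (n, 2), and the names nonempty, empty_idx, outer, inner, sorted_pairs and sorted_index are just illustrative.

```python
import numpy as np

a1 = [np.array([[83, 84]]),
      np.array([[21, 24], [32, 53], [54, 56]]),
      np.array([[21, 24], [32, 37], [45, 46]]),
      np.array([]), np.array([])]

# Split the outer positions into non-empty blocks and empty ones.
nonempty = [(i, np.asarray(b).reshape(-1, 2)) for i, b in enumerate(a1) if np.size(b)]
empty_idx = [i for i, b in enumerate(a1) if not np.size(b)]

# Stack all pairs and record where each row came from.
pairs = np.vstack([b for _, b in nonempty])
outer = np.concatenate([np.full(len(b), i) for i, b in nonempty])
inner = np.concatenate([np.arange(len(b)) for _, b in nonempty])

# Stable argsort on the negated difference gives descending order
# while keeping tied pairs in their original order.
diff = np.abs(pairs[:, 1] - pairs[:, 0])
order = np.argsort(-diff, kind="stable")

# Empty elements are appended last with a placeholder inner index of 0.
sorted_pairs = pairs[order].tolist() + [[] for _ in empty_idx]
sorted_index = (np.column_stack([outer[order], inner[order]]).tolist()
                + [[i, 0] for i in empty_idx])

print(sorted_pairs)
# [[32, 53], [32, 37], [21, 24], [21, 24], [54, 56], [83, 84], [45, 46], [], []]
print(sorted_index)
# [[1, 1], [2, 1], [1, 0], [2, 0], [1, 2], [0, 0], [2, 2], [3, 0], [4, 0]]
```

This avoids growing a DataFrame row by row, at the cost of handling the empty entries separately.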

Upvotes: 0
