How to sort a list of path strings to match another list of paths in Python?

Question

I have two lists that contain file paths:

lst_A =['/home/data_A/test_AA_123.jpg',
        '/home/data_A/test_AB_234.jpg',
        '/home/data_A/test_BB_321.jpg',
        '/home/data_A/test_BC_112.jpg',
       ]

lst_B =['/home/data_B/test_AA_222.jpg',
        '/home/data_B/test_CC_444.jpg',
        '/home/data_B/test_AB_555.jpg',
        '/home/data_B/test_BC_777.jpg',
       ]

Based on lst_A, I want to sort lst_B so that the paths at the same position in both lists share the first two underscore-separated parts of their basename (here, test_xx). The expected sorted lst_B is:

lst_B =['/home/data_B/test_AA_222.jpg',
        '/home/data_B/test_AB_555.jpg',
        '/home/data_B/test_CC_444.jpg',
        '/home/data_B/test_BC_777.jpg',
       ]

In addition, I want to indicate at which positions the two lists share those first two parts of the basename (such as test_xx), so the indicator array should be:

array_same =[1,1,0,1]

How can I do this in Python? I tried the .sort() method, but it returns an unexpected result. Thanks.

Update: This is my solution

import os
lst_A =['/home/data_A/test_AA_123.jpg',
        '/home/data_A/test_AB_234.jpg',
        '/home/data_A/test_BB_321.jpg',
        '/home/data_A/test_BC_112.jpg',
       ]

lst_B =['/home/data_B/test_AA_222.jpg',
        '/home/data_B/test_CC_444.jpg',
        '/home/data_B/test_AB_555.jpg',
        '/home/data_B/test_BC_777.jpg']

lst_B_sort=[]
same_array=[]
for ind_a, a_name in enumerate(lst_A):
    for ind_b, b_name in enumerate(lst_B):
        print(os.path.basename(b_name).split('_')[1])
        if os.path.basename(b_name).split('_')[1] in os.path.basename(a_name):
            lst_B_sort.append(b_name)
            same_array.append(1)
print(lst_B_sort)
print(same_array)

Output: ['/home/data_B/test_AA_222.jpg', '/home/data_B/test_AB_555.jpg', '/home/data_B/test_BC_777.jpg']

[1, 1, 1]

The result is incomplete because my code never appends the elements whose names do not match.

pylang · Accepted Answer

We will discuss the issue with a SIMPLE technique followed by an APPLIED solution.

SIMPLE

We just focus on sorting the names given a key.

Given

Simple names and a key list:

lst_a = "AA AB BB BC EE".split()
lst_b = "AA DD CC AB BC".split()

key_list = [1, 1, 0, 1, 0]

Code

same = sorted(set(lst_a) & set(lst_b))
diff = sorted(set(lst_b) - set(same))

isame, idiff = iter(same), iter(diff)
[next(isame) if x else next(idiff) for x in key_list]
# ['AA', 'AB', 'CC', 'BC', 'DD']

The names that lst_b shares with lst_a are placed, in sorted order, at the positions where key_list is 1. The remaining names fill the positions marked 0.

Details

This problem is mainly reduced to sorting the intersection of names from both lists. The intersection is a set of common elements called same. The remnants are in a set called diff. We sort same and diff and here's what they look like:

same
# ['AA', 'AB', 'BC']
diff
# ['CC', 'DD']

Now we just want to pull a value from either list, in order, according to the key. We start by iterating the key_list. If 1, pull from the isame iterator. Otherwise, pull from idiff.
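
The key_list above is written by hand. As a minimal sketch, assuming the key should mark the positions in lst_a whose name also occurs in lst_b, it could be derived from the two lists like this (the variable names_b is only illustrative):

```python
lst_a = "AA AB BB BC EE".split()
lst_b = "AA DD CC AB BC".split()

# Build a set once so each membership test is cheap.
names_b = set(lst_b)

# 1 where the name at that position of lst_a also occurs in lst_b, else 0.
key_list = [1 if name in names_b else 0 for name in lst_a]

print(key_list)  # [1, 1, 0, 1, 0]
```

This reproduces the key used in the example, so the rest of the technique should work unchanged with a computed key.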

Now that we have the basic technique, we can apply it to the more complicated path example.

APPLIED

The same technique, this time with path strings:

Given

import pathlib 


lst_a = "foo/t_AA_a.jpg foo/t_AB_a.jpg foo/t_BB_a.jpg foo/t_BC_a.jpg foo/t_EE_a.jpg".split()
lst_b = "foo/t_AA_b.jpg foo/t_DD_b.jpg foo/t_CC_b.jpg foo/t_AB_b.jpg foo/t_BC_b.jpg".split()

key_list = [1, 1, 0, 1, 0]

# Helper
def get_name(s_path):
    """Return the shared 'name' from a string path.

    Examples
    --------
    >>> get_name("foo/test_xx_a.jpg")
    'test_xx'

    """
    return pathlib.Path(s_path).stem.rsplit("_", maxsplit=1)[0]
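
To see what the helper does step by step, here is a small sketch that applies the same two operations to one of the paths from the question. The variable names stem and name are illustrative only:

```python
import pathlib

path = "/home/data_B/test_AA_222.jpg"  # sample path taken from the question

# .stem is the final path component without its suffix.
stem = pathlib.Path(path).stem

# Split once from the right to drop the trailing "_<number>" part.
name = stem.rsplit("_", maxsplit=1)[0]

print(stem)  # test_AA_222
print(name)  # test_AA
```

Splitting from the right keeps everything before the last underscore, so this relies on the file names ending in a single trailing field such as _222.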

Code

Map the names to paths:

name_path_a = {get_name(p): p for p in lst_a}
name_path_b = {get_name(p): p for p in lst_b}

Names are in dict keys, so directly substitute sets with dict keys:

same = sorted(name_path_a.keys() & name_path_b.keys())
diff = sorted(name_path_b.keys() - set(same))

isame, idiff = iter(same), iter(diff)

Get the paths via names pulled from iterators:

[name_path_b[next(isame)] if x else name_path_b[next(idiff)] for x in key_list]

Output

['foo/t_AA_b.jpg',
 'foo/t_AB_b.jpg',
 'foo/t_CC_b.jpg',
 'foo/t_BC_b.jpg',
 'foo/t_DD_b.jpg']
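
For the data in the question, the pieces can be combined into one function that returns both the reordered list and the indicator array. This is only a sketch under stated assumptions: align_paths is a hypothetical helper, both lists are assumed to have the same length, and names are assumed to be unique within each list. Unlike the code above, it follows the order of the first list rather than the alphabetical order of the shared names.

```python
import pathlib


def get_name(s_path):
    """Return the shared name, e.g. 'test_AA' for '.../test_AA_222.jpg'."""
    return pathlib.Path(s_path).stem.rsplit("_", maxsplit=1)[0]


def align_paths(paths_a, paths_b):
    """Hypothetical helper: reorder paths_b to follow paths_a and flag matches.

    Assumes equal-length lists and unique names within each list.
    """
    name_path_b = {get_name(p): p for p in paths_b}
    names_a = {get_name(p) for p in paths_a}

    # Paths in paths_b with no partner in paths_a, in their original order.
    leftovers = iter([p for p in paths_b if get_name(p) not in names_a])

    aligned, flags = [], []
    for path_a in paths_a:
        name = get_name(path_a)
        if name in name_path_b:
            aligned.append(name_path_b[name])  # matching partner from paths_b
            flags.append(1)
        else:
            aligned.append(next(leftovers))    # fill the gap with an unmatched path
            flags.append(0)
    return aligned, flags


lst_A = ['/home/data_A/test_AA_123.jpg',
         '/home/data_A/test_AB_234.jpg',
         '/home/data_A/test_BB_321.jpg',
         '/home/data_A/test_BC_112.jpg']

lst_B = ['/home/data_B/test_AA_222.jpg',
         '/home/data_B/test_CC_444.jpg',
         '/home/data_B/test_AB_555.jpg',
         '/home/data_B/test_BC_777.jpg']

sorted_b, array_same = align_paths(lst_A, lst_B)
print(sorted_b)
print(array_same)  # [1, 1, 0, 1]
```

With the lists from the question, sorted_b comes out in the order AA, AB, CC, BC, which matches the expected result stated there.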
