Ircbarros

Reputation: 1053

Efficiently find matching indices and values for nested list of tuples

I am working with lists of different sizes. Suppose I have two lists of tuples with different lengths:

value1 = [(0, 1), (0, 2), (0, 3)]
value2 = [(0, 6), (0, 2), (0, 4), (0, 9), (0, 7)]

Both are stored in another list:

my_list = [value1, value2]

What is the most efficient way (preferably O(n)) to find the matching indices when appending a third list to my_list, and to return them in order? The result should look something like:

value3 = [(0, 1), (0, 2), (0, 3), (0, 5), (0, 7), (0, 10)]

matching_values(my_list, value3):

    my_list.append(value3)

    return ->  "The list 'value3' matches 'value1' at
                index 0: (0, 1) and index 1: (0, 2), and 'value2'
                at index 4: (0, 7)"

Note: It would be ideal if this also worked for more than three lists.

Upvotes: 0

Views: 135

Answers (2)

darkwing

Reputation: 160

I don't think set() will work on its own, since you want the indices. This is not especially efficient, but it works:

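from itertools import zip_longest

def find_matches(my_list, new_list):
    indices = []
    values = []
    # Compare the first two lists position by position. zip_longest pads the
    # shorter list with None, so positions past its end (such as index 4 of
    # value2) are still checked; plain zip would stop at the shorter list.
    for index, (a, b) in enumerate(zip_longest(my_list[0], my_list[1])):
        for new_value in new_list:
            if new_value == a or new_value == b:
                indices.append(index)
                values.append(new_value)
    if not indices:
        return "No matches"
    return "Matching Values: {} Corresponding Indices: {}".format(values, indices)

Then just call the function:

print(find_matches(my_list, value3))

Output:

Matching Values: [(0, 1), (0, 2), (0, 3), (0, 7)] Corresponding Indices: [0, 1, 2, 4]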

Here's a solution with pandas, which should be faster on large inputs and can handle as many lists as you want. Hope this helps.

import pandas as pd

def find_matches(my_list, new_list):

    # build one DataFrame with a column per list in my_list;
    # shorter lists are padded with NaN
    dfs = []
    for l in my_list:
        l_series = pd.Series(l)
        l_df = pd.DataFrame(l_series)
        dfs.append(l_df)
    df = pd.concat(dfs, ignore_index=True, axis=1)

    # keep only the cells whose value appears in new_list;
    # every other cell becomes NaN
    df2 = df[df.isin(new_list)]
    # drop rows where no match was found in any list
    df_final = df2.dropna(how='all')

    return df_final

calling:

find_matches(my_list, value3)

Return:

        0       1
0  (0, 1)     NaN
1  (0, 2)  (0, 2)
2  (0, 3)     NaN
4     NaN  (0, 7)

Upvotes: 0

Kitsu

Reputation: 3445

I'm not sure it's the most efficient way, but it is readable and straightforward:

v3 = set(value3)

[set(x).intersection(v3) for x in my_list]
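For the sample data in the question, the intersection can be checked as follows. This is only a small sketch: the name `common` and the `sorted()` call are additions made here so that the unordered set contents print in a stable order.

```python
value1 = [(0, 1), (0, 2), (0, 3)]
value2 = [(0, 6), (0, 2), (0, 4), (0, 9), (0, 7)]
value3 = [(0, 1), (0, 2), (0, 3), (0, 5), (0, 7), (0, 10)]
my_list = [value1, value2]

v3 = set(value3)

# One list of shared tuples per stored list; sorting is for display only,
# because sets do not keep the original order or positions.
common = [sorted(set(x).intersection(v3)) for x in my_list]
print(common)  # [[(0, 1), (0, 2), (0, 3)], [(0, 2), (0, 7)]]
```

Note that this gives the matching values only; the positions in the original lists are lost once the lists are turned into sets.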

UPD: an extended solution that also returns the indices, taken from each value's position in its original list:

v3 = set(value3)

[(i, k) for x in my_list for (i, k) in enumerate(x) if k in v3]
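The comprehension above does not record which list each match came from. One possible way to keep that information for any number of lists is to build a dictionary from each tuple to its positions first, then look up the new list's values in order. The sketch below assumes the tuples are hashable; `build_index` and `match_positions` are hypothetical names chosen for illustration, not part of either answer. With average constant-time dictionary lookups, the work should be roughly linear in the total number of elements plus the number of matches.

```python
from collections import defaultdict

def build_index(lists):
    """Map each value to every (list_number, position) where it occurs."""
    index = defaultdict(list)
    for list_number, values in enumerate(lists):
        for position, value in enumerate(values):
            index[value].append((list_number, position))
    return index

def match_positions(lists, new_list):
    """Return (value, list_number, position) triples, in new_list order."""
    index = build_index(lists)
    return [
        (value, list_number, position)
        for value in new_list
        # .get avoids creating empty entries for values that never occur
        for list_number, position in index.get(value, [])
    ]

value1 = [(0, 1), (0, 2), (0, 3)]
value2 = [(0, 6), (0, 2), (0, 4), (0, 9), (0, 7)]
value3 = [(0, 1), (0, 2), (0, 3), (0, 5), (0, 7), (0, 10)]
my_list = [value1, value2]

# Find the matches before appending, so value3 is not compared with itself
matches = match_positions(my_list, value3)
my_list.append(value3)

for value, list_number, position in matches:
    print("list {} index {}: {}".format(list_number, position, value))
```

For the sample data this reports (0, 1), (0, 2) and (0, 3) in the first list at indices 0, 1 and 2, and (0, 2) and (0, 7) in the second list at indices 1 and 4.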

Upvotes: 1
