Replace elements in list by column values of dataframe based on matching with another column in the dataframe

Question

I have a problem that is tricky for me. I have a list of lists:

list_lists = [['0', 'red/c/16', 'green/c/19', 'blue/c/16'],
              ['1', 'red/a/14', 'green/b/15', 'blue/c/25', 'green/c/28']]

and a dataframe:

import pandas as pd

data = [['red', 'ID=2', 'red/a'], ['green', 'ID=4', 'green/b'],
        ['blue', 'ID=6', 'blue/c'], ['green', 'ID=7', 'green/c'],
        ['red', 'ID=8', 'red/c']]
df = pd.DataFrame(data, columns=['Color', 'ID', 'Startswith'])

I want to replace each element in list_lists with the value from the column 'ID' of the row whose 'Startswith' value matches the beginning of that element. How can this be done? Note: the first element of each inner list is only the index of that list and can be ignored.

DJK · Accepted Answer

1. Convert df to a dict with a format like this: {'red/a': {'ID': 'ID=2'}, ...}
2. Trim each string element in list_lists to its first two '/'-separated parts so that it matches the format of the values in Startswith (which are now the keys of your dict).
3. Map each element to the corresponding 'ID' value in the dict.

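# Build a nested lookup: {'red/a': {'ID': 'ID=2'}, 'green/b': {'ID': 'ID=4'}, ...}
id_lookup = df.set_index('Startswith')[['ID']].to_dict(orient='index')

mapped_lists = []
for i, list_ in enumerate(list_lists):
    # Skip the leading index element; reduce e.g. 'red/c/16' to 'red/c' before the lookup
    mapped_lists.append(
        [i] + [id_lookup['/'.join(string.split('/')[:2])]['ID']
               for string in list_[1:]]
    )

# mapped_lists => [[0, 'ID=8', 'ID=7', 'ID=6'], [1, 'ID=2', 'ID=4', 'ID=6', 'ID=7']]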
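
Note that the loop above replaces the original string index ('0', '1') with the integer from enumerate, and it raises a KeyError if an element has no matching Startswith value. If either of those matters, here is a minimal sketch of a variant. It assumes a flat dict is enough, and that unmatched elements should simply be left unchanged; the helper name to_id and the variable names are made up for illustration.

```python
import pandas as pd

data = [['red', 'ID=2', 'red/a'], ['green', 'ID=4', 'green/b'],
        ['blue', 'ID=6', 'blue/c'], ['green', 'ID=7', 'green/c'],
        ['red', 'ID=8', 'red/c']]
df = pd.DataFrame(data, columns=['Color', 'ID', 'Startswith'])

list_lists = [['0', 'red/c/16', 'green/c/19', 'blue/c/16'],
              ['1', 'red/a/14', 'green/b/15', 'blue/c/25', 'green/c/28']]

# Flat lookup instead of a nested one: {'red/a': 'ID=2', 'green/b': 'ID=4', ...}
flat_lookup = dict(zip(df['Startswith'], df['ID']))


def to_id(element, lookup):
    """Hypothetical helper: map e.g. 'red/c/16' to its ID.

    Assumption: elements without a matching prefix are returned unchanged.
    """
    prefix = '/'.join(element.split('/')[:2])  # 'red/c/16' -> 'red/c'
    return lookup.get(prefix, element)


# Keep the first element of each inner list exactly as it was in the input
mapped_keep_index = [
    [row[0]] + [to_id(element, flat_lookup) for element in row[1:]]
    for row in list_lists
]

print(mapped_keep_index)
```

Using dict.get with a default trades the loud KeyError for silent pass-through, so only use it if unmatched elements are actually expected in your data.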
