lonyen11

Reputation: 93

Replace elements in list by column values of dataframe based on matching with another column in the dataframe

I have a problem that is tricky for me. I have a list of lists:

list_lists = [['0', 'red/c/16', 'green/c/19', 'blue/c/16'],
              ['1', 'red/a/14', 'green/b/15', 'blue/c/25', 'green/c/28']]

and a dataframe:

import pandas as pd

data = [['red', 'ID=2', 'red/a'], ['green', 'ID=4', 'green/b'],
        ['blue', 'ID=6', 'blue/c'], ['green', 'ID=7', 'green/c'],
        ['red', 'ID=8', 'red/c']]
df = pd.DataFrame(data, columns=['Color', 'ID', 'Startswith'])

I want to replace each element in list_lists with the value from the 'ID' column of the row whose 'Startswith' value matches the beginning of that element. How can this be done? Note: the first element of each list is only the index of the list and can be ignored.

Upvotes: 1

Views: 887

Answers (1)

DJK

Reputation: 175

  1. Convert df to a dict with a format like this: {'red/a': {'ID': 'ID=2'}, ...}
  2. Format the string elements in list_lists to match the format of the values in Startswith (which are now the keys of your dict)
  3. Map each element to the corresponding 'ID' value in the dict

id_lookup = df.set_index('Startswith')[['ID']].to_dict(orient='index')

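mapped_lists = []
for i, list_ in enumerate(list_lists):
    # Skip the leading index element; reduce each remaining string to its
    # first two '/'-separated parts ('red/c/16' -> 'red/c') and look up the ID.
    mapped_lists.append(
        [i] + [id_lookup['/'.join(string.split('/')[:2])]['ID']
               for string in list_[1:]]
    )

# mapped_lists => [[0, 'ID=8', 'ID=7', 'ID=6'], [1, 'ID=2', 'ID=4', 'ID=6', 'ID=7']]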
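
If some elements might not have a matching 'Startswith' value, the lookup above would raise a KeyError. A possible variant, sketched here under a few assumptions, uses a flat dict and leaves unmatched elements unchanged. The helper name to_id is hypothetical, and the sketch assumes every element ends in exactly one extra '/'-separated part (such as '/16'). To keep it runnable without pandas, the dict is built from the raw rows; from the DataFrame the same dict should come from dict(zip(df['Startswith'], df['ID'])).

```python
list_lists = [['0', 'red/c/16', 'green/c/19', 'blue/c/16'],
              ['1', 'red/a/14', 'green/b/15', 'blue/c/25', 'green/c/28']]

data = [['red', 'ID=2', 'red/a'], ['green', 'ID=4', 'green/b'],
        ['blue', 'ID=6', 'blue/c'], ['green', 'ID=7', 'green/c'],
        ['red', 'ID=8', 'red/c']]

# Flat lookup keyed by the 'Startswith' value: {'red/a': 'ID=2', ...}
# With the DataFrame: id_lookup = dict(zip(df['Startswith'], df['ID']))
id_lookup = {startswith: id_ for _color, id_, startswith in data}


def to_id(element, lookup):
    """Hypothetical helper: map an element to its ID, or return it unchanged."""
    # Assumption: drop only the last '/'-separated part, 'red/c/16' -> 'red/c'
    prefix = element.rsplit('/', 1)[0]
    return lookup.get(prefix, element)


# Keep the first element (the list index) as it is and map the rest
mapped_lists = [[row[0]] + [to_id(element, id_lookup) for element in row[1:]]
                for row in list_lists]
```

Unlike the loop above, this keeps the original index strings ('0', '1') instead of replacing them with the enumerate counter; swap row[0] for an integer if that is preferred.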

Upvotes: 1
