Reputation: 175

Join pandas data frames based on columns and column of lists

I'm trying to join two data frames based on multiple columns. However, one of the conditions is not straight forward, because one column in one data frame exists in column of lists in the other data frame. As following

df_a :

df_b:

I want to join both data frames based on node,channel,value. However, value is inside a list in column trail

I have tried to create a dictionary from df_a to have the node and channel as keys, but that won't be unique, I added the value as a key but then I couldn't look that value up. Any tip to solve this issue is much appreciated

Upvotes: 2

Answers (1)

Scott Boston

Reputation: 153460

Update per comment by @JonClements if not always the first element try:

(df_b.assign(value=df_b['trail'].str.split(','))
     .explode('value')
     .merge(df_a, on=['node', 'channel', 'value']))

Try, if value is always the first element in trail:

import pandas as pd

df_a = pd.DataFrame({'node':['a']*4, 
                     'channel':[1,2,1,2],   
                     'wavelength':[555, 666, 555, 666], 
                     'value':['1-111-5', '1-222-6', '1-111-7', '1-222-8']})

df_b = pd.DataFrame({'node':['a']*4,
                     'channel':[1,2,1,2],
                     'trail':['1-111-5,1-a-5,2-r-3', 
                             '1-222-6,2-4-5,1-d-3', 
                             '1-111-7,1-a-2,1-f-t', 
                             '1-222-8,1-e-4'],
                    'power':[10,12,13,12]})

df_out = (df_b.assign(value=df_b['trail'].str.split(',').str[0])
              .merge(df_a, on=['node', 'channel', 'value']))
print(df_out)

Output:

  node  channel                trail  power    value  wavelength
0    a        1  1-111-5,1-a-5,2-r-3     10  1-111-5         555
1    a        2  1-222-6,2-4-5,1-d-3     12  1-222-6         666
2    a        1  1-111-7,1-a-2,1-f-t     13  1-111-7         555
3    a        2        1-222-8,1-e-4     12  1-222-8         666

Upvotes: 1

Join pandas data frames based on columns and column of lists

Answers (1)

Related Questions