Creating a list column in a dataframe based on values in another dataframe

Question

I have two DataFrames:

df1:

       node        ids
0   ab          [978]
1   bc          [978, 121]

df2:

       name        id
0   alpha          978
1   bravo          121

I would like to add a new column called names to df1 that holds, for each row, the list of names from df2 corresponding to the values in the ids column, like this:

   node            ids             names
0   ab            [978]            [alpha]
1   bc            [978, 121]       [alpha,bravo]

I would appreciate any help.

jezrael · Accepted Answer

Use a dictionary lookup. This works if the values in df1['ids'] and df2['id'] have the same type (both integers or both strings):

d = df2.set_index('id')['name'].to_dict()
df1['names'] = [[d.get(y) for y in x] for x in df1['ids']]
print(df1)
  node         ids           names
0   ab       [978]         [alpha]
1   bc  [978, 121]  [alpha, bravo]
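
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is reconstructed from the question. The string-id case (str_ids, d_str) is an assumed scenario added only to illustrate why the types have to match; it is not part of the original data.

```python
import pandas as pd

# Sample data reconstructed from the question.
df1 = pd.DataFrame({"node": ["ab", "bc"], "ids": [[978], [978, 121]]})
df2 = pd.DataFrame({"name": ["alpha", "bravo"], "id": [978, 121]})

# Build an id -> name lookup, then translate every id in every list.
d = df2.set_index("id")["name"].to_dict()
df1["names"] = [[d.get(y) for y in x] for x in df1["ids"]]

# Assumed mismatch scenario: ids stored as strings while df2['id'] holds integers.
str_ids = pd.Series([["978"], ["978", "121"]])
unmatched = [[d.get(y) for y in x] for x in str_ids]  # every lookup misses

# Converting the dictionary keys to strings makes the types agree again.
d_str = {str(k): v for k, v in d.items()}
matched = [[d_str.get(y) for y in x] for x in str_ids]
```

With mismatched types d.get() does not raise; it quietly returns None for every id, so it is worth checking the types before doing the lookup.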

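If a value in a list might not match any value in df2['id'], pass a default to d.get() so that it is replaced by a placeholder such as 'no match' (here the first row of ids is [978, 10]):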

d = df2.set_index('id')['name'].to_dict()
df1['names'] = [[d.get(y, 'no match') for y in x] for x in df1['ids']]
print(df1)
  node         ids              names
0   ab   [978, 10]  [alpha, no match]
1   bc  [978, 121]     [alpha, bravo]

Or it is possible to omit the unmatched values:

d = df2.set_index('id')['name'].to_dict()
df1['names'] = [[d[y] for y in x if y in d] for x in df1['ids']]
print(df1)
  node         ids           names
0   ab   [978, 10]         [alpha]
1   bc  [978, 121]  [alpha, bravo]
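
As an alternative to the list comprehensions above, the same mapping can be sketched with pandas methods only: explode the lists, map each id, then regroup by the original row. This is a different technique from the one in the answer, shown on the question's data where every id has a match.

```python
import pandas as pd

# Sample data reconstructed from the question.
df1 = pd.DataFrame({"node": ["ab", "bc"], "ids": [[978], [978, 121]]})
df2 = pd.DataFrame({"name": ["alpha", "bravo"], "id": [978, 121]})

d = df2.set_index("id")["name"].to_dict()

# explode() yields one row per id and repeats the original index label,
# map() translates each id to its name, and groupby(level=0) collects
# the names back into one list per original row.
df1["names"] = df1["ids"].explode().map(d).groupby(level=0).agg(list)
```

Note that in this variant an id without a match becomes NaN rather than None, and an empty list in ids ends up as a list holding a single NaN, so the list comprehension is probably the simpler choice when such rows can occur.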
