Qubix
Qubix

Reputation: 4353

Merge elements from two dataframes based on given index

Given two dataframes:

idx_A | item_A       | req_idx_B
1        'dog'           0
2        'cat'           3
3        'fish'          1
4        'weasel'        1
5        'wolf'          0
6        'dinosaur'      3

and

idx_B | item_B      
1        'red'           
2        'blue'           
3        'green'    

I want to append item_B to all item_A when req_idx_B is the same as idx_B. If req_idx_B = 0, then the row should simply be ignored. So the result should be:

idx_A |          item_A          |
1        'dog'
2        'cat, green, dinosaur'
3        'fish, red, weasel'
4        'weasel, red, fish'
5        'wolf'
6        'dinosaur, cat, green'

The final format of the item_A column is a string and the order is completely irrelevant, as long as it contains all elements.

Notice that, since both 'cat' and 'dinosaur' have the same req_idx_B, they are concatenated together with the corresponding element from the second dataframe.

Upvotes: 0

Views: 26

Answers (1)

jezrael
jezrael

Reputation: 862661

Use GroupBy.transform with join by all rows not matched mask, add , and Series.maped values from second DataFrame and last repalce missing values by original values in item_A:

s  = test1[test1['req_idx_B'] != 0].groupby('req_idx_B')['item_A'].transform(', '.join)
mapping = test2.set_index('idx_B')['item_B']
test1['item_A'] = (s + ', ' + test1['req_idx_B'].map(mapping)).fillna(test1['item_A'])
print (test1)
   idx_A                      item_A  req_idx_B
0      1                       'dog'          0
1      2  'cat', 'dinosaur', 'green'          3
2      3     'fish', 'weasel', 'red'          1
3      4     'fish', 'weasel', 'red'          1
4      5                      'wolf'          0
5      6  'cat', 'dinosaur', 'green'          3

Upvotes: 1

Related Questions