Reputation: 4353
Given two dataframes:
idx_A | item_A | req_idx_B
1 'dog' 0
2 'cat' 3
3 'fish' 1
4 'weasel' 1
5 'wolf' 0
6 'dinosaur' 3
and
idx_B | item_B
1 'red'
2 'blue'
3 'green'
I want to append item_B
to all item_A
when req_idx_B
is the same as idx_B
. If req_idx_B = 0
, then the row should simply be ignored. So the result should be:
idx_A | item_A |
1 'dog'
2 'cat, green, dinosaur'
3 'fish, red, weasel'
4 'weasel, red, fish'
5 'wolf'
6 'dinosaur, cat, green'
The final format of the item_A column is a string and the order is completely irrelevant, as long as it contains all elements.
Notice that, since both 'cat' and 'dinosaur' have the same req_idx_B
, they are concatenated together with the corresponding element from the second dataframe.
Upvotes: 0
Views: 26
Reputation: 862661
Use GroupBy.transform
with join
by all rows not matched mask, add ,
and Series.map
ed values from second DataFrame and last repalce missing values by original values in item_A
:
s = test1[test1['req_idx_B'] != 0].groupby('req_idx_B')['item_A'].transform(', '.join)
mapping = test2.set_index('idx_B')['item_B']
test1['item_A'] = (s + ', ' + test1['req_idx_B'].map(mapping)).fillna(test1['item_A'])
print (test1)
idx_A item_A req_idx_B
0 1 'dog' 0
1 2 'cat', 'dinosaur', 'green' 3
2 3 'fish', 'weasel', 'red' 1
3 4 'fish', 'weasel', 'red' 1
4 5 'wolf' 0
5 6 'cat', 'dinosaur', 'green' 3
Upvotes: 1