Reputation: 133
I have two DataFrames, A and B, with the same four columns. I want to merge them so that when the first three column values match, the id values (each a jsonb array) are combined into a single array.
A:
name age zip id
abc 25 11111 ["2722", "2855", "3583"]
B:
name age zip id
abc 25 11111 ["123", "234"]
I want the final output to look like this:
name age zip id
----------------------------------------------------------------
abc 25 11111 ["2722", "2855", "3583", "123", "234"]
Upvotes: 1
Views: 43
Reputation: 402303
Another option is to merge, then use a list comprehension to handle the "id" columns.
output = df_A.merge(df_B, on=['name', 'age', 'zip'])
output['id'] = [[*x, *y] for x, y in zip(output.pop('id_x'), output.pop('id_y'))]
output
name age zip id
0 abc 25 11111 [2722, 2855, 3583, 123, 234]
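For anyone who wants to run this end to end, here is a minimal sketch using the sample rows from the question. It assumes the jsonb values arrive as JSON text, so they are parsed into Python lists first; if your "id" column already holds lists, skip that step. The frame names df_A and df_B follow the snippet above.

```python
import json

import pandas as pd

# Sample rows from the question. Assumption: the jsonb column is read as JSON text.
df_A = pd.DataFrame({"name": ["abc"], "age": [25], "zip": [11111],
                     "id": ['["2722", "2855", "3583"]']})
df_B = pd.DataFrame({"name": ["abc"], "age": [25], "zip": [11111],
                     "id": ['["123", "234"]']})

# Parse the JSON text into real Python lists so they can be concatenated.
for df in (df_A, df_B):
    df["id"] = df["id"].apply(json.loads)

# Inner merge on the three key columns; the clashing "id" columns
# come back as "id_x" (from df_A) and "id_y" (from df_B).
output = df_A.merge(df_B, on=["name", "age", "zip"])

# Pop both suffixed columns and join each pair of lists into one.
output["id"] = [[*x, *y] for x, y in zip(output.pop("id_x"), output.pop("id_y"))]
```

Note that merge defaults to an inner join, so rows whose name/age/zip appear in only one frame are dropped from the result.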
Upvotes: 1
Reputation: 323226
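One quick solution is to index both frames on the key columns and add them; for list-valued cells, + concatenates the two lists: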
# df1 and df2 are the question's DataFrames A and B
keys = ['name', 'age', 'zip']
df = (df1.set_index(keys) + df2.set_index(keys)).reset_index()
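A self-contained sketch of the same idea with the question's sample rows, assuming the "id" cells are already Python lists (not JSON text) and that each name/age/zip combination appears at most once per frame:

```python
import pandas as pd

key_cols = ["name", "age", "zip"]

# Sample rows from the question, with "id" stored as Python lists.
df1 = pd.DataFrame({"name": ["abc"], "age": [25], "zip": [11111],
                    "id": [["2722", "2855", "3583"]]})
df2 = pd.DataFrame({"name": ["abc"], "age": [25], "zip": [11111],
                    "id": [["123", "234"]]})

# Addition aligns rows on the (name, age, zip) index; on an object column
# it falls back to Python's + for each cell, which concatenates the lists.
combined = (df1.set_index(key_cols) + df2.set_index(key_cols)).reset_index()
```

Because the addition aligns on the index, a key that exists in only one frame ends up with NaN in "id" rather than keeping its original list, so this shortcut fits best when both frames share the same keys.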
Upvotes: 1