How to only use the first match when merging Pandas Dataframes?

Question

I have two dataframes, 'data' and 'working'. I am trying to merge data into working on ID so that working gets the Name column.

Data:
ID  Name
a1  a1_Name
a1  a1_Name1

Working:
ID  SomeValues 
a1  123
a1  456

I want each row of working to get only the first Name found for its ID, but as of now when I do something like

working = pd.merge(working, data, left_on="ID", right_on="ID", sort=False)

I get this:

Working:
ID  SomeValues Name 
a1  123        a1_Name
a1  456        a1_Name1

and it just alternates between 'a1_Name' and 'a1_Name1'.

I would like it to output:

Working:
ID  SomeValues Name 
a1  123        a1_Name
a1  456        a1_Name

Mayank Porwal · Accepted Answer

Take the first Name per ID with groupby and transform, then merge and drop the duplicate rows, like this:

In [3004]: data['new_name'] = data.groupby('ID')['Name'].transform('first')
In [3008]: data.merge(working, on='ID')[['ID','new_name','SomeValues']].drop_duplicates()
Out[3008]: 
   ID new_name  SomeValues
0  a1  a1_Name         123
1  a1  a1_Name         456
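
A possible alternative, offered as a sketch rather than as part of the accepted answer: reduce the lookup frame to one row per ID before merging, so the merge cannot multiply rows in the first place. This also avoids relying on drop_duplicates() after the merge, which would collapse rows of working that legitimately repeat the same ID and SomeValues. The frames below are rebuilt from the sample data in the question; the names first_names and result are illustrative only.

```python
import pandas as pd

# Sample frames reconstructed from the question.
data = pd.DataFrame({"ID": ["a1", "a1"], "Name": ["a1_Name", "a1_Name1"]})
working = pd.DataFrame({"ID": ["a1", "a1"], "SomeValues": [123, 456]})

# Keep only the first Name seen for each ID, so every ID appears once
# in the lookup table.
first_names = data.drop_duplicates(subset="ID", keep="first")

# A left merge keeps every row of `working` in its original order.
# validate="many_to_one" makes pandas raise if the right side still
# contains duplicate IDs.
result = working.merge(first_names, on="ID", how="left", validate="many_to_one")
print(result)
```

With how="left", IDs in working that have no match in data are kept and get NaN in Name; use how="inner" if those rows should be dropped instead.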
