Reputation: 113
I have two dataframes: 'data' and 'working'. I am trying to merge data into working on the ID to get the Name into working
Data:
ID Name
a1 a1_Name
a1 a1_Name1
Working:
ID SomeValues
a1 123
a1 456
I want the end result to match only on the first found ID each time, but as of now when I do something like
working = pd.merge(working, data, left_on="ID", right_on="ID", sort=False)
I get this:
Working:
ID SomeValues Name
a1 123 a1_Name
a1 456 a1_Name1
and it just alternates between 'a1_Name' and 'a1_Name1'.
I would like it to output:
Working:
ID SomeValues Name
a1 123 a1_Name
a1 456 a1_Name
Upvotes: 2
Views: 1800
Reputation: 75120
You can first drop duplicates and then map:
(working.assign(Name=working['ID']
.map(data.drop_duplicates('ID').set_index('ID')['Name'])))
ID SomeValues Name
0 a1 123 a1_Name
1 a1 456 a1_Name
Note: you have to assign this to the same dataframe or a new dataframe as you want.
Upvotes: 1
Reputation: 34086
Like this:
In [3004]: Data['new_name'] = Data.groupby('ID')['Name'].transform('first')
In [3008]: Data.merge(Working, on='ID')[['ID','new_name','SomeValues']].drop_duplicates()
Out[3008]:
ID new_name SomeValues
0 a1 a1_Name 123
1 a1 a1_Name 456
Upvotes: 3