Reputation: 27
Sounds menacing but I'll make it clear.
I have 2 dataframes:
df1
Group Target Sales
0 A 13432 5756
1 A 13443 17083
2 A 13462 17635
3 A 13501 10266
4 A 13377 18375
5 A 13417 13963
6 A 13448 16056
11 B 13539 13020
12 B 13567 18171
13 B 14501 12135
14 B 13608 12399
15 B 13578 7419
16 B 13555 9083
21 C 13556 12121
22 C 13576 11633
23 C 13693 23983
24 C 12501 17222
25 C 13670 10547
26 C 13614 10945
27 C 13497 10411
df2
Group NewTarget
0 A 13500
1 B 14500
2 C 12500
The common column is Group.
I'd like to do a lookup of sorts which looks up the 'NewTarget' value in df2 against the 'Target' value in df1 and returns the closest 'Sales' value as a new column 'New' in df1 by group. So the result would look like this:
df3
Group Target Sales New
0 A 13432 5756 10266
1 A 13443 17083 10266
2 A 13462 17635 10266
3 A 13501 10266 10266
4 A 13377 18375 10266
5 A 13417 13963 10266
6 A 13448 16056 10266
11 B 13539 13020 12135
12 B 13567 18171 12135
13 B 14501 12135 12135
14 B 13608 12399 12135
15 B 13578 7419 12135
16 B 13555 9083 12135
21 C 13556 12121 17222
22 C 13576 11633 17222
23 C 13693 23983 17222
24 C 12501 17222 17222
25 C 13670 10547 17222
26 C 13614 10945 17222
27 C 13497 10411 17222
Note I have many other numeric columns in df1, but trying to keep as simple as possible.
Appreciate the help.
Many thanks in advance.
Upvotes: 0
Views: 734
Reputation: 323356
Check with panda merge_asof
out1 = pd.merge_asof(df2.sort_values('NewTarget'),
df1.sort_values('Target'),
left_on = 'NewTarget',
right_on='Target',
by='Group',
direction = 'nearest')
out = df1.merge(out1.loc[:,['Group','Sales']],on='Group')
Out[615]:
Group Target Sales_x Sales_y
0 A 13432 5756 10266
1 A 13443 17083 10266
2 A 13462 17635 10266
3 A 13501 10266 10266
4 A 13377 18375 10266
5 A 13417 13963 10266
6 A 13448 16056 10266
7 B 13539 13020 12135
8 B 13567 18171 12135
9 B 14501 12135 12135
10 B 13608 12399 12135
11 B 13578 7419 12135
12 B 13555 9083 12135
13 C 13556 12121 17222
14 C 13576 11633 17222
15 C 13693 23983 17222
16 C 12501 17222 17222
17 C 13670 10547 17222
18 C 13614 10945 17222
19 C 13497 10411 17222
Upvotes: 0
Reputation: 863301
Use:
df = df1.merge(df2, on='Group', how='left')
df['NewTarget'] = (df.set_index('Sales')
.assign(diff = lambda x: x['Target'].sub(x['NewTarget']).abs())
.groupby('Group')['diff']
.transform('idxmin')
.to_numpy())
print (df)
Group Target Sales NewTarget
0 A 13432 5756 10266
1 A 13443 17083 10266
2 A 13462 17635 10266
3 A 13501 10266 10266
4 A 13377 18375 10266
5 A 13417 13963 10266
6 A 13448 16056 10266
7 B 13539 13020 12135
8 B 13567 18171 12135
9 B 14501 12135 12135
10 B 13608 12399 12135
11 B 13578 7419 12135
12 B 13555 9083 12135
13 C 13556 12121 17222
14 C 13576 11633 17222
15 C 13693 23983 17222
16 C 12501 17222 17222
17 C 13670 10547 17222
18 C 13614 10945 17222
19 C 13497 10411 17222
Upvotes: 1