Reputation: 601
Problem
I've got two dataframes;
I want to merge seller_df['price'] column to buyer_df based on the marginal quantity value (i.e. nearest largest match)
What I've tried
I found this SO Link which got me close. however it finds the nearest number match, not the nearest larger number match.
Example
# Seller
seller_df = pd.DataFrame({
'seller': ['A', 'B', 'C', 'A', 'B', 'C'],
'quantity': [40, 30, 60, 30, 80, 900],
'price': [3.33, 3.50, 3.95, 4.25, 4.99, 5.25],
'cum_avail': [40, 70, 130, 160, 240, 1140]
})
# Buyer
buyer_df = pd.DataFrame({
'time': ['t1', 't2', 't3', 't4', 't5', 't6', 't7', 't8', 't9', 't10'],
'quantity_required': [26, 563, 120, 233, 11, 198, 756, 64, 116, 224]
})
# What I've tried, based on SO link
S = seller_df.assign(key=1)
B = buyer_df.assign(key=1)
merged_SB = pd.merge(S, B, on='key', suffixes=('_S', '_B'))
M = merged_SB.groupby('cum_avail').apply(lambda x: abs(x['cum_avail']-x['quantity_required']) == abs(x['cum_avail']-x['quantity_required']).min())
merged_SB = merged_SB[M.values].drop_duplicates().drop('key', axis=1)
Desired Output
time quantity_required price
0 t1 26 3.33
1 t2 563 5.25
2 t3 120 3.95
3 t4 233 4.99
4 t5 11 3.33
5 t6 198 4.99
6 t7 756 5.25
7 t8 64 3.50
8 t9 116 3.95
9 t10 224 4.99
Upvotes: 1
Views: 41
Reputation: 153510
Let's use pd.merge_asof
:
pd.merge_asof(B.sort_values('quantity_required'), S,
right_on='cum_avail',
left_on='quantity_required', direction='forward' )[['time', 'quantity_required', 'price']]
Output:
time quantity_required price
0 t5 11 3.33
1 t1 26 3.33
2 t8 64 3.50
3 t9 116 3.95
4 t3 120 3.95
5 t6 198 4.99
6 t10 224 4.99
7 t4 233 4.99
8 t2 563 5.25
9 t7 756 5.25
Upvotes: 3