Reputation: 584
Hi I have the following two pandas dataframes: df1 and df2.
I want to create a new dataframe, df3 such that it is the same as df1 but with one extra column called "new price".
The way I want new price to be populated is to return the first price with the same code from df2 that is greater than or equal to the price in df1.
Here are the dataframes:
df1:
Code Price
X 4.3
X 2.5
X 4
X 1.5
X 0.24
X 1
X 1.3
Y 3.9
Y 2.6
df2:
Code Price
X 0.5
X 1
X 1.5
X 2
X 2.5
X 3
X 3.5
X 4
X 4.5
X 5
X 5.5
Y 0.5
Y 1
Y 1.5
Y 2
Y 2.5
Y 3
Y 3.5
Y 4
Y 4.5
Y 5
Y 5.5
So as an example let is consider the first entry in df1
Code Price
X 4.3
So the column new price should look at all prices with code X in df2 and return the smallest price from df2 that is greater than or equal to 4.3.
In this case it is 4.5.
Repeat this for each line to get
df3:
Code Price New Price
X 4.3 4.5
X 2.5 2.5
...
Y 2.6 3
Does anyone know how to achieve this, I have tried pandas merge but that didn't work.
Upvotes: 2
Views: 153
Reputation: 75080
You can do a cross join and then query
, finally groupby().first()
:
m=(df1.assign(key=1).merge(df2.assign(key=1),on='key',suffixes=('','_y')).drop('key', 1)
.query("(Code==Code_y)&(Price<=Price_y)"))
m.groupby(['Code','Price'],sort=False)['Price_y'].first().reset_index(name='New Price')
Code Price New Price
0 X 4.30 4.5
1 X 2.50 2.5
2 X 4.00 4.0
3 X 1.50 1.5
4 X 0.24 0.5
5 X 1.00 1.0
6 X 1.30 1.5
7 Y 3.90 4.0
8 Y 2.60 3.0
Upvotes: 1