user9940344
user9940344

Reputation: 584

How to add a new column to a pandas df that returns the smallest value that is greater in the same group from another dataframe

Hi I have the following two pandas dataframes: df1 and df2.

I want to create a new dataframe, df3 such that it is the same as df1 but with one extra column called "new price".

The way I want new price to be populated is to return the first price with the same code from df2 that is greater than or equal to the price in df1.

Here are the dataframes:

df1:

Code    Price
X        4.3    
X        2.5    
X        4  
X        1.5    
X        0.24   
X        1  
X        1.3    
Y        3.9    
Y        2.6    

df2:

Code Price
X   0.5
X   1
X   1.5
X   2
X   2.5
X   3
X   3.5
X   4
X   4.5
X   5
X   5.5
Y   0.5
Y   1
Y   1.5
Y   2
Y   2.5
Y   3
Y   3.5
Y   4
Y   4.5
Y   5
Y   5.5

So as an example let is consider the first entry in df1

Code Price
X    4.3

So the column new price should look at all prices with code X in df2 and return the smallest price from df2 that is greater than or equal to 4.3.

In this case it is 4.5.

Repeat this for each line to get

df3:

Code    Price   New Price
X        4.3       4.5
X        2.5       2.5
...
Y         2.6       3

Does anyone know how to achieve this, I have tried pandas merge but that didn't work.

Upvotes: 2

Views: 153

Answers (1)

anky
anky

Reputation: 75080

You can do a cross join and then query , finally groupby().first():

m=(df1.assign(key=1).merge(df2.assign(key=1),on='key',suffixes=('','_y')).drop('key', 1)
                                            .query("(Code==Code_y)&(Price<=Price_y)"))
m.groupby(['Code','Price'],sort=False)['Price_y'].first().reset_index(name='New Price')

  Code  Price  New Price
0    X   4.30        4.5
1    X   2.50        2.5
2    X   4.00        4.0
3    X   1.50        1.5
4    X   0.24        0.5
5    X   1.00        1.0
6    X   1.30        1.5
7    Y   3.90        4.0
8    Y   2.60        3.0

Upvotes: 1

Related Questions