francisco socías
francisco socías

Reputation: 39

filter pandas dataframe based in another column

this might be a basic question, but I have not being able to find a solution. I have two dataframes, with identical rows and columns, called Volumes and Prices, which are like this

Volumes

Index    ProductA    ProductB     ProductC     ProductD    Limit
0          100          300          400           78       100
1          110          370           20           30       100
2           90          320          200          121       100
3          150          320          410           99       100
....

Prices

Index    ProductA    ProductB     ProductC     ProductD    Limit
0           50          110          30           90        0
1           51          110          29           99        0
2           49          120          25           88        0
3           51          110          22           96        0
....

I want to assign 0 to the "cell" of the Prices dataframe which correspond to Volumes less than what it is on the Limit column

so, the ideal output would be

Prices

Index    ProductA    ProductB     ProductC     ProductD    Limit
0           50          110          30            0         0
1           51          110           0            0         0
2            0          120          25           88         0
3           51          110          22            0         0
....

I tried

import pandas as pd
import numpy as np
d_price = {'ProductA' : [50, 51, 49, 51], 'ProductB' : [110,110,120,110], 
'ProductC' : [30,29,25,22],'ProductD' : [90,99,88,96], 'Limit': [0]*4}
d_volume = {'ProductA' : [100,110,90,150], 'ProductB' : [300,370,320,320], 
'ProductC' : [400,20,200,410],'ProductD' : [78,30,121,99], 'Limit': [100]*4}
Prices = pd.DataFrame(d_price)
Volumes = pd.DataFrame(d_volume)

Prices[Volumes > Volumes.Limit]=0

but I do not obtain any changes to the Prices dataframe... obviously I'm having a hard time understanding boolean slicing, any help would be great

Upvotes: 1

Views: 2533

Answers (2)

hatipa
hatipa

Reputation: 26

you can use mask to solve this problem, I am not an expert either but this solutions does what you want to do.

test=(Volumes.ix[:,'ProductA':'ProductD'] >= Volumes.Limit.values)
final = Prices[test].fillna(0)

Upvotes: 0

Jihun
Jihun

Reputation: 1485

The problem is in

Prices[Volumes > Volumes.Limit]=0

Since Limit varies on each row, you should use, for example, apply like following:

Prices[Volumes.apply(lambda x : x>x.Limit, axis=1)]=0

Upvotes: 1

Related Questions