water77

Reputation: 123

For loop with multiple IF conditions in Pandas

I want to write a for loop that applies conditions to columns of a pandas DataFrame:

import numpy as np  
import pandas as pd


df = pd.read_csv("data.csv")
print(df)

DWWC1980     DWWC1985   DWWC1990  
16.7140310  16.35661439 15.89201716  
20.9414479  18.00822799 15.73516051  
33.95022337 51.87065104 73.76376497  
144.7000805 136.1462017 130.9143924  
54.9506033  75.03339188 93.22994974  

The for loop with the conditions:

for i in range (1980,2015,5):

    if   any(df["DWWC"+str(i)] <=18.25)  :

            df['MWTP'+str(i)]=(((10-33)/(5))*(df["DWWC"+str(i)]-5))+10  

    elif any((df["DWWC"+str(i)] >  18.25) &  (df["DWWC"+str(i)] <= 36.5)) :

            df['MWTP'+str(i)]=((10/(df.two-df.three))*(df["DWWC"+str(i)]-df.three))+df.Three

    else :
            df['MWTP'+str(i)]=(((df.Three_value-6)/(df.three-5))*(df["DWWC"+str(i)]-6))

df.to_csv('MWTP1.csv',index='ISO3')

But when I run this code and compare the output with a manual calculation, only the first condition's formula gives correct values; the other conditions never seem to be applied. (df.one, df.two, and df.three are other columns.)

  MWTP1980       MWTP1985         MWTP1990  
 25.87096095    30.72758886  37.04060109  
 -77.06996017   20.00112954      95.22533503  
 -290.1012655   -640.6304196    -1068.866556  
 -1845.172654   -1718.865351    -1641.61201  
 -1397.638671   -2171.737373    -2873.130596  

Upvotes: 1

Views: 1886

Answers (2)

jezrael

Reputation: 862691

You can use numpy.select to apply the conditions row by row, and str.format to build the column names:

for i in range (1980,2015,5):
    m1 = df["DWWC{}".format(i)] <=18.25
    #inverted m1 mask by ~
    m2 = ~m1 & (df["DWWC{}".format(i)] <= 36.5)
    a = (((10-33)/(5))*(df["DWWC{}".format(i)]-5))+10 
    b = ((10/(df.two-df.three))*(df["DWWC{}".format(i)]-df.three))+df.Three
    c = (((df.Three_value-6)/(df.three-5))*(df["DWWC{}".format(i)]-6))

    df["MWTP{}".format(i)] = np.select([m1,m2],[a,b], default=c)
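
To make the mask logic above easier to check, here is a minimal self-contained sketch that runs on the DWWC1980 values from the question. The band labels "low", "mid" and "high" and the column name `band` are placeholders I made up for illustration; they stand in for the real formulas `a`, `b` and `c`, which depend on columns not shown in the question.

```python
import numpy as np
import pandas as pd

# DWWC1980 values copied from the question's sample data.
df = pd.DataFrame(
    {"DWWC1980": [16.7140310, 20.9414479, 33.95022337, 144.7000805, 54.9506033]}
)

col = df["DWWC1980"]

# One boolean per row, not one boolean for the whole column.
m1 = col <= 18.25
m2 = ~m1 & (col <= 36.5)

# Hypothetical labels standing in for the formulas a, b and c.
# np.select picks, for each row, the choice of the first mask that is True,
# and falls back to `default` where no mask is True.
df["band"] = np.select([m1, m2], ["low", "mid"], default="high")

print(df)
```

Each row gets the choice that matches its own value, which is what the `if`/`elif`/`else` version cannot do.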

Upvotes: 1

Dor Shinar

Reputation: 1512

I believe your problem is the usage of if elif else as follows:

if any(df["DWWC"+str(i)] <= 18.25):
    pass  # executes if the first condition is true
elif any((df["DWWC"+str(i)] > 18.25) & (df["DWWC"+str(i)] <= 36.5)):
    pass  # executes if the first condition is false and the second condition is true
else:
    pass  # executes if both conditions are false

So when your first condition is met, it never checks the other ones. Try changing it to something like this:

if any(df["DWWC"+str(i)] <= 18.25):
    pass  # executes if the first condition is true
if any((df["DWWC"+str(i)] > 18.25) & (df["DWWC"+str(i)] <= 36.5)):
    pass  # executes if the second condition is true, regardless of the first
else:
    pass  # executes if the second condition is false
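
One thing worth checking alongside this: `any(...)` collapses the whole column into a single True/False, so whichever branch runs assigns its formula to every row. A rough sketch of the difference, using the DWWC1980 values from the question (the names `s`, `low`, `mid`, `high` and `out` are mine, and only the first formula is filled in because the other two depend on columns not shown):

```python
import pandas as pd

# DWWC1980 values copied from the question's sample data.
s = pd.Series([16.7140310, 20.9414479, 33.95022337, 144.7000805, 54.9506033])

# any() answers "is at least one row <= 18.25?" with a single boolean,
# so an if-statement built on it picks one branch for the entire column.
any_low = any(s <= 18.25)

# Row-wise masks keep one boolean per row instead.
low = s <= 18.25
mid = (s > 18.25) & (s <= 36.5)
high = s > 36.5

# Apply the first formula from the question only to the rows in the low band.
# The mid and high bands would be filled the same way with their own formulas.
out = pd.Series(index=s.index, dtype=float)
out[low] = ((10 - 33) / 5) * (s[low] - 5) + 10

print(any_low)
print(out)
```

Rows outside the low band stay NaN here until their own formula is assigned through `out[mid]` and `out[high]`.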

Upvotes: 1
