Reputation: 123
I want to write a for loop with conditions over the columns of a pandas DataFrame:
import numpy as np
import pandas as pd
df = pd.read_csv("data.csv")  # read_csv already returns a DataFrame
print(df)
DWWC1980 DWWC1985 DWWC1990
16.7140310 16.35661439 15.89201716
20.9414479 18.00822799 15.73516051
33.95022337 51.87065104 73.76376497
144.7000805 136.1462017 130.9143924
54.9506033 75.03339188 93.22994974
For loop condition statement:
for i in range(1980, 2015, 5):
    if any(df["DWWC"+str(i)] <=18.25) :
        df['MWTP'+str(i)]=(((10-33)/(5))*(df["DWWC"+str(i)]-5))+10
    elif any((df["DWWC"+str(i)] > 18.25) & (df["DWWC"+str(i)] <= 36.5)) :
        df['MWTP'+str(i)]=((10/(df.two-df.three))*(df["DWWC"+str(i)]-df.three))+df.Three
    else :
        df['MWTP'+str(i)]=(((df.Three_value-6)/(df.three-5))*(df["DWWC"+str(i)]-6
df.to_csv('MWTP1.csv',index='ISO3')
But when I run this code and compare the output with a manual calculation, only the rows that fall under the first condition are correct; the other conditions are never applied. (df.one, df.two, and df.three are other columns.)
MWTP1980 MWTP1985 MWTP1990
25.87096095 30.72758886 37.04060109
-77.06996017 20.00112954 95.22533503
-290.1012655 -640.6304196 -1068.866556
-1845.172654 -1718.865351 -1641.61201
-1397.638671 -2171.737373 -2873.130596
Upvotes: 1
Views: 1886
Reputation: 862691
You can use numpy.select to choose a formula for each row, and format to build the column names:
for i in range(1980, 2015, 5):
    m1 = df["DWWC{}".format(i)] <=18.25
    # invert the m1 mask with ~
    m2 = ~m1 & (df["DWWC{}".format(i)] <= 36.5)
    a = (((10-33)/(5))*(df["DWWC{}".format(i)]-5))+10
    b = ((10/(df.two-df.three))*(df["DWWC{}".format(i)]-df.three))+df.Three
    c = (((df.Three_value-6)/(df.three-5))*(df["DWWC{}".format(i)]-6
    df["MWTP{}".format(i)] = np.select([m1,m2],[a,b], default=c)
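To see the row-wise selection in isolation, here is a minimal, self-contained sketch. It uses the DWWC1980 and DWWC1985 values from the question, but the three formulas `a`, `b` and `c` are simple placeholders chosen for illustration (an assumption, since the real ones depend on columns not shown here).

```python
import numpy as np
import pandas as pd

# Sample values copied from the question's DWWC1980 and DWWC1985 columns.
df = pd.DataFrame({
    "DWWC1980": [16.7140310, 20.9414479, 33.95022337, 144.7000805, 54.9506033],
    "DWWC1985": [16.35661439, 18.00822799, 51.87065104, 136.1462017, 75.03339188],
})

for i in (1980, 1985):
    col = df["DWWC{}".format(i)]
    m1 = col <= 18.25              # rows in the first band
    m2 = ~m1 & (col <= 36.5)       # rows in the second band

    # Placeholder formulas (hypothetical, not the ones from the question).
    a = col - 5
    b = 2 * col
    c = -col

    # Each row gets the choice whose mask is True, otherwise the default.
    df["MWTP{}".format(i)] = np.select([m1, m2], [a, b], default=c)

print(df)
```

Each row is evaluated against the masks independently, so a single column can mix results from all three formulas.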
Upvotes: 1
Reputation: 1512
I believe your problem is the way you use if/elif/else, which behaves as follows:
if any(df["DWWC"+str(i)] <=18.25):
    pass  # executes if the first condition is true
elif any((df["DWWC"+str(i)] > 18.25) & (df["DWWC"+str(i)] <= 36.5)):
    pass  # executes if the first condition is false and the second condition is true
else:
    pass  # executes if both conditions are false
So once your first condition is met, the other ones are never checked. Try changing it to something like this:
if any(df["DWWC"+str(i)] <=18.25):
    pass  # executes if the first condition is true
if any((df["DWWC"+str(i)] > 18.25) & (df["DWWC"+str(i)] <= 36.5)):
    pass  # executes if the second condition is true, regardless of the first
else:
    pass  # executes if the second condition is false
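As a small illustration of why the first condition is met for the whole column at once, the sketch below uses the DWWC1980 values from the question. The comparison yields one boolean per row, but `any(...)` collapses it to a single True/False, so whichever branch runs assigns its formula to every row. The variable names are only for illustration.

```python
import pandas as pd

# Values copied from the question's DWWC1980 column.
col = pd.Series(
    [16.7140310, 20.9414479, 33.95022337, 144.7000805, 54.9506033],
    name="DWWC1980",
)

mask = col <= 18.25              # element-wise: one boolean per row
first_branch_taken = any(mask)   # a single bool for the whole column

print(mask.tolist())             # [True, False, False, False, False]
print(first_branch_taken)        # True
```

Only one of the five rows satisfies the condition, yet the test as a whole evaluates to True.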
Upvotes: 1