Reputation: 167
I have a dataframe in which I need to iterate through one of the columns and, depending on a condition, apply one of two sets of equations.
I've written the code below, but I'm not getting the right result. The code checks whether each value in input_data is positive, but the else branch never seems to take effect for the negative value: the output always uses the equations for the positive case.
Thanks in advance for any advice on this.
import pandas as pd
x=[-1,1]
y=[2,3]
df=pd.DataFrame({'x':x, 'y':y})
print(df)
x y
0 -1 2
1 1 3
input_data=df['x']
for i in range(len(input_data)):
    if input_data[i]>0:
        df['z']=input_data[i]+1
        df['z2']=df['z']+1
        df['z3']=1
    else:
        df['z']=input_data[i]-1
        df['z2']=df['z']-1
        df['z3']=0
print(df)
x y z z2 z3
0 -1 2 2 3 1
1 1 3 2 3 1
Upvotes: 1
Views: 33
Reputation: 41327
In pandas, row-wise loops like this are usually written with apply():
df[['z','z2','z3']] = df.apply(
    lambda row: [row.x+1, row.x+2, 1] if row.x > 0 else [row.x-1, row.x-2, 0],
    result_type='expand',
    axis=1)
# x y z z2 z3
# 0 -1 2 -2.0 -3.0 0.0
# 1 1 3 2.0 3.0 1.0
Or you can use the vectorized np.where():
import numpy as np
df['z'] = np.where(df.x > 0, df.x + 1, df.x - 1)
df['z2'] = np.where(df.x > 0, df.z + 1, df.z - 1)
df['z3'] = df.x.gt(0).astype(int)
# x y z z2 z3
# 0 -1 2 -2 -3 0
# 1 1 3 2 3 1
As for the for loop implementation, the issue is in the assignment statements.
For example, df['z3'] = 1 sets the whole z3 column to 1 (not just a particular row of z3, but the whole column). Similarly, df['z3'] = 0 sets the whole column to 0. The same applies to all of those assignment statements.
So, because the last x value is positive, the final iteration sets all the z columns to the positive result.
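If you do want to keep an explicit loop, one way to fix it is to assign to a single cell with df.loc[row, column] rather than to the whole column. The following is only a minimal sketch, not the recommended approach (the apply() and np.where() versions are shorter and usually faster). Pre-creating the z, z2 and z3 columns with zeros is an assumption made here so that they exist with an integer dtype before the loop runs.

```python
import pandas as pd

df = pd.DataFrame({'x': [-1, 1], 'y': [2, 3]})

# Assumption: create the result columns up front so each row can be filled in place.
df['z'] = 0
df['z2'] = 0
df['z3'] = 0

for i in df.index:
    x = df.loc[i, 'x']
    if x > 0:
        # df.loc[i, 'z'] touches only row i, unlike df['z'] = ...
        df.loc[i, 'z'] = x + 1
        df.loc[i, 'z2'] = df.loc[i, 'z'] + 1
        df.loc[i, 'z3'] = 1
    else:
        df.loc[i, 'z'] = x - 1
        df.loc[i, 'z2'] = df.loc[i, 'z'] - 1
        df.loc[i, 'z3'] = 0

print(df)
```

With this change each iteration only writes to its own row, so the negative row keeps the result of the else branch instead of being overwritten by the last iteration.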
Upvotes: 1