Reputation: 453
I have a pandas dataframe that looks like this:
val_1 val_2 Flag
Date
2018-08-27 221.0 121.0 0
2018-08-28 222.0 122.0 1
2018-08-29 223.0 123.0 0
2018-08-30 224.0 124.0 2
2018-08-31 225.0 125.0 0
I want to replace the values in the Flag column with values from the other columns, depending on the flag. Namely, if Flag is 1, replace it with val_1 from the same row, and if Flag is 2, replace it with val_2. The output I am looking for would look like this:
val_1 val_2 Flag
Date
2018-08-27 221.0 121.0 0
2018-08-28 222.0 122.0 222.0
2018-08-29 223.0 123.0 0
2018-08-30 224.0 124.0 124.0
2018-08-31 225.0 125.0 0
I know that I can use .loc like this: df.loc[df['Flag'] == 1, ['Flag']] = , but I don't know what goes on the right-hand side of the assignment.
Upvotes: 5
Views: 7775
Reputation: 150735
Try this:
new_vals = df.lookup(df.index, df.columns[df.Flag-1])
df['Flag'] = df.Flag.mask(df.Flag>0, new_vals)
Note: as commented by @Erfan, this would also work:
df['Flag'] = df.lookup(df.index, df.columns[df.Flag-1])
Output:
val_1 val_2 Flag
Date
2018-08-27 221.0 121.0 0
2018-08-28 222.0 122.0 222
2018-08-29 223.0 123.0 0
2018-08-30 224.0 124.0 124
2018-08-31 225.0 125.0 0
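Note that DataFrame.lookup was deprecated in pandas 1.2.0 and removed in 2.0, so the calls above raise an AttributeError on current versions. A rough equivalent using NumPy integer indexing might look like the sketch below. It assumes the sample data from the question and, like the original, relies on Flag being the last column, so that a flag of 0 (column index -1) picks up the Flag value itself.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question (assumption: Flag is the last column).
df = pd.DataFrame(
    {
        "val_1": [221.0, 222.0, 223.0, 224.0, 225.0],
        "val_2": [121.0, 122.0, 123.0, 124.0, 125.0],
        "Flag": [0, 1, 0, 2, 0],
    },
    index=pd.date_range("2018-08-27", periods=5, name="Date"),
)

values = df.to_numpy()              # 2-D float array of all columns
rows = np.arange(len(df))           # one row position per row
cols = df["Flag"].to_numpy() - 1    # 1 -> val_1, 2 -> val_2, 0 -> -1 (Flag itself)

# Pick one cell per row, which is what df.lookup used to do.
df["Flag"] = values[rows, cols]
```

Because the values go through a float array, the resulting Flag column is float rather than integer.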
Upvotes: 3
Reputation: 26676
Another way is to use np.where, which has the form numpy.where(condition, yes, no).
In this case I nest two np.where calls: where Flag is 1, take val_1; otherwise evaluate a second np.where, which takes val_2 where Flag is 2 and keeps the original Flag everywhere else.
import numpy as np
df['Flag'] = np.where(df['Flag'] == 1, df['val_1'], np.where(df['Flag'] == 2, df['val_2'], df['Flag']))
df
Output
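For anyone who wants to reproduce this, a self-contained version of the nested np.where approach could look like the following. The frame is rebuilt from the question's sample data, and the intermediate name `flag` is only there for readability.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame(
    {
        "val_1": [221.0, 222.0, 223.0, 224.0, 225.0],
        "val_2": [121.0, 122.0, 123.0, 124.0, 125.0],
        "Flag": [0, 1, 0, 2, 0],
    },
    index=pd.date_range("2018-08-27", periods=5, name="Date"),
)

flag = df["Flag"]  # keep a reference to the original flags

# Outer where handles Flag == 1; the inner where handles Flag == 2
# and falls back to the original flag for every other row.
df["Flag"] = np.where(
    flag == 1,
    df["val_1"],
    np.where(flag == 2, df["val_2"], flag),
)
```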
Upvotes: 4
Reputation: 23099
There are a few ways you could do this. Firstly, your initial code is very close; you just need to finish the assignment:
df.loc[df['Flag'] == 1, 'Flag'] = df['val_1']
print(df)
Date val_1 val_2 Flag
0 2018-08-27 221.0 121.0 0.0
1 2018-08-28 222.0 122.0 222.0
2 2018-08-29 223.0 123.0 0.0
3 2018-08-30 224.0 124.0 2.0
4 2018-08-31 225.0 125.0 0.0
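What you're doing here is filtering your dataframe and replacing the values where the condition matches, in this instance where Flag is equal to 1. The right-hand side is aligned on the index, so only the matching rows are written. Rows where Flag is 2 are left unchanged, so val_2 would need a second assignment.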
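A minimal sketch of handling both flags with .loc is shown below, assuming the sample data from the question. The mask names `is_one` and `is_two` are made up for illustration; both masks are computed before any assignment so that a freshly written value can never be mistaken for a flag.

```python
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame(
    {
        "val_1": [221.0, 222.0, 223.0, 224.0, 225.0],
        "val_2": [121.0, 122.0, 123.0, 124.0, 125.0],
        "Flag": [0, 1, 0, 2, 0],
    },
    index=pd.date_range("2018-08-27", periods=5, name="Date"),
)

# Build both boolean masks from the original flags first.
is_one = df["Flag"].eq(1)
is_two = df["Flag"].eq(2)

# Cast up front so the column can hold the float values being written.
df["Flag"] = df["Flag"].astype(float)

# Overwrite only the matching rows with the value from the same row.
df.loc[is_one, "Flag"] = df.loc[is_one, "val_1"]
df.loc[is_two, "Flag"] = df.loc[is_two, "val_2"]
```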
Since you're making multiple assignments, let's use np.select:
import numpy as np
conditions = [df['Flag'].eq(1),
df['Flag'].eq(2)]
choices = [df['val_1'],df['val_2']]
df['Flag'] = np.select(conditions,choices,default=df['Flag'])
What this does is evaluate each condition in order, leaving the original column as the default where none of them match. You can add more conditions, and you can combine conditions with a | (pipe) as OR. Wrap each comparison in its own parentheses, because | binds more tightly than ==, i.e. [(df['Flag'] == 1) | (df['Flag'] == 2)]
Date val_1 val_2 Flag
0 2018-08-27 221.0 121.0 0.0
1 2018-08-28 222.0 122.0 222.0
2 2018-08-29 223.0 123.0 0.0
3 2018-08-30 224.0 124.0 124.0
4 2018-08-31 225.0 125.0 0.0
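To illustrate the OR case, here is a small sketch in which a hypothetical extra flag value of 3 (not part of the original question) is also mapped to val_1. Each comparison is parenthesised before being combined with |.

```python
import numpy as np
import pandas as pd

# Sample frame from the question, except the last flag is changed to a
# hypothetical value of 3 purely to demonstrate the combined condition.
df = pd.DataFrame(
    {
        "Date": pd.date_range("2018-08-27", periods=5),
        "val_1": [221.0, 222.0, 223.0, 224.0, 225.0],
        "val_2": [121.0, 122.0, 123.0, 124.0, 125.0],
        "Flag": [0, 1, 0, 2, 3],
    }
)

conditions = [
    (df["Flag"] == 1) | (df["Flag"] == 3),  # either flag takes val_1
    df["Flag"] == 2,                        # flag 2 takes val_2
]
choices = [df["val_1"], df["val_2"]]

# Rows matching no condition keep their original flag value.
df["Flag"] = np.select(conditions, choices, default=df["Flag"])
```

Without the inner parentheses, Python evaluates 1 | df['Flag'] first and then a chained comparison, which raises a ValueError about the ambiguous truth value of a Series.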
Upvotes: 4