Reputation: 73
I have to update a data frame column if a condition is met. But there are multiple conditions and multiple values to update to. Therefore I want to do it in a loop.
The data frame is like:
mode car1 car2 bus1 bus2
car1 10 20 5 2
car2 11 22 3 1
bus1 4 4 2 2
bus2 3 4 3 5
I realize the data structure is slightly odd but let's go with this. If mode says car1, I want the new column value to have the value in the column car1. And so on.
My code is like:
targets = ['car1', 'car2', 'bus1', 'bus2']
for target in targets:
df.loc[(df.mode==f'target'),'value']=df.[target]
This works but every it replaces the rows in which the condition isn't met with a NaN. Therefore, I only end up with the new value column containing the value for bus2 in bus2 rows but NaNs in all other rows.
In Stata, I would have written:
gen value = .
foreach x in car1 car2 bus1 bus2 {
replace value = `x' if mode=="`x'"
}
Looking for similar code in Python!
Upvotes: 0
Views: 956
Reputation: 1074
This should work:
df['newcol'] = 0
for key, item in df.iterrows():
df['newcol'].iloc[key] = item[item['mode']]
Upvotes: 0
Reputation: 323226
In pandas
there is lookup
df['newvalue']=df.set_index('mode').lookup(df['mode'],df['mode'])
df
Out[184]:
mode car1 car2 bus1 bus2 newcol newvalue
0 car1 10 20 5 2 10 10
1 car2 11 22 3 1 22 22
2 bus1 4 4 2 2 2 2
3 bus2 3 4 3 5 5 5
Upvotes: 1