Reputation: 23
I am applying a lambda function on my DataFrame, but I think I am doing it wrong.
What is the correct way to apply a lambda function?
input:
ID Name result old_result
A1 Jim Bad Good
A2 Tim Good Good
A3 Matt Good None
code:
df3['avg_result'] = df3['result'].apply(lambda x : x['old_result'] if x['result'] == 'Bad' else x['result'])
Expected Output:
ID Name result old_result avg_result
A1 Jim Bad Good Good
A2 Tim Good Good Good
A3 Matt Good None Good
Upvotes: 0
Views: 844
Reputation: 2248
You are calling df['result'].apply(lambda x: ...), so x is each individual string in the df['result'] column, not a row. That's why x['old_result'] fails with "string indices must be integers": it is effectively doing 'Good'['old_result'], which is not possible.
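To see this for yourself, here is a small sketch that rebuilds the sample data from the question (the frame is named df here, and the variable names seen_types and error_message are just for illustration). It records what type the lambda actually receives and catches the resulting error instead of crashing:

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "ID": ["A1", "A2", "A3"],
    "Name": ["Jim", "Tim", "Matt"],
    "result": ["Bad", "Good", "Good"],
    "old_result": ["Good", "Good", None],
})

# Series.apply hands the function one cell value at a time
seen_types = df["result"].apply(lambda x: type(x).__name__).tolist()
print(seen_types)

# Indexing a plain string with a column name raises TypeError
try:
    df["result"].apply(lambda x: x["old_result"])
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

The exact wording of the message varies a little between Python versions, but it always starts with "string indices must be integers".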
What you need is
df['avg_result'] = df.apply(lambda x : x['old_result'] if x['result'] == 'Bad' else x['result'], axis=1)
Instead of applying the lambda function to each string in the df['result'] column, this applies the function to each row of the DataFrame (that's what axis=1 does). Within that row, x['old_result'] returns the value of the old_result column for that row.
ID Name result old_result avg_result
0 A1 Jim Bad Good Good
1 A2 Tim Good Good Good
2 A3 Matt Good None Good
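For completeness, a self-contained version of the row-wise approach, assuming the sample data from the question. The helper name pick_result is made up for this sketch; it does the same thing as the lambda, just with a name so it is easier to read and test:

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "ID": ["A1", "A2", "A3"],
    "Name": ["Jim", "Tim", "Matt"],
    "result": ["Bad", "Good", "Good"],
    "old_result": ["Good", "Good", None],
})

def pick_result(row):
    # row is a Series holding one DataFrame row, so column labels work as keys
    return row["old_result"] if row["result"] == "Bad" else row["result"]

# axis=1 passes each row (not each column) to the function
df["avg_result"] = df.apply(pick_result, axis=1)
print(df)
```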
In fact, you can do the same thing in a much more readable way with np.where instead of .apply and a lambda (this needs import numpy as np):
df['npwhere_result'] = np.where(df['result']=='Bad', df['old_result'], df['result'])
ID Name result old_result avg_result npwhere_result
0 A1 Jim Bad Good Good Good
1 A2 Tim Good Good Good Good
2 A3 Matt Good None Good Good
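Here is the np.where version as a runnable sketch, again assuming the sample data from the question. As a side note that goes slightly beyond the answer above, Series.mask gives the same result while staying in pandas; the column name mask_result is only illustrative:

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "ID": ["A1", "A2", "A3"],
    "Name": ["Jim", "Tim", "Matt"],
    "result": ["Bad", "Good", "Good"],
    "old_result": ["Good", "Good", None],
})

# Vectorized: where result is 'Bad' take old_result, otherwise keep result
df["npwhere_result"] = np.where(
    df["result"] == "Bad", df["old_result"], df["result"]
)

# pandas-native alternative: replace values where the condition is True
df["mask_result"] = df["result"].mask(df["result"] == "Bad", df["old_result"])

print(df)
```

Both forms work on whole columns at once, so on large frames they are usually faster than calling a Python function once per row with apply.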
Upvotes: 2