Reputation: 516
My goal is to find and replace values in one column based on a condition on another column, so that the replacement only applies to the rows that match.
Take for instance:
import pandas
data = [['red', 'not done'], ['red', 'not done'], ['green', 'not done']]
df = pandas.DataFrame(data, columns = ['color', 'status'])
print(df)
We have our output DataFrame:
   color    status
0    red  not done
1    red  not done
2  green  not done
I want the status of every row whose color is green to be changed to done, like this:
   color    status
0    red  not done
1    red  not done
2  green      done
What I have tried:
df['status'] = df['status'].replace(to_replace = [df['color'] == 'green'], value = 'done')
But this does not change anything.
I also tried:
df['status'] = df.where(cond = [df['color'] == 'green'] , other = 'done')
but this raises ValueError: Array conditional must be same shape as self, which I don't understand.
How can I make this replacement correctly?
Upvotes: 1
Views: 617
Reputation: 150805
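The condition should be the boolean Series itself, not a list wrapping it, and where should be called on the status column rather than on the whole DataFrame. That list is why the shapes do not match.
This line
df['status'] = df.where(cond = [df['color'] == 'green'] , other = 'done')
should be either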
df['status'] = df['status'].mask(df['color'] == 'green' , 'done')
or:
df['status'] = df['status'].where(df['color'] == 'red' , 'done')
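For reference, here is a minimal runnable sketch of the two calls above on the question's data, plus a df.loc assignment as a third option. The names is_green, masked and kept are only illustrative. The where call here uses ~is_green instead of == 'red', on the assumption that colors other than red and green may appear in real data.

```python
import pandas as pd

data = [['red', 'not done'], ['red', 'not done'], ['green', 'not done']]
df = pd.DataFrame(data, columns=['color', 'status'])

# Boolean Series with one True/False per row (illustrative name)
is_green = df['color'] == 'green'

# mask() replaces the values where the condition is True
masked = df['status'].mask(is_green, 'done')

# where() keeps the values where the condition is True and
# replaces the rest, so the condition has to be negated
kept = df['status'].where(~is_green, 'done')

# .loc assigns directly to the selected rows of a single column
df.loc[is_green, 'status'] = 'done'
print(df)
```

All three leave the red rows untouched and set only the green row to done.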
Upvotes: 1
Reputation: 648
import numpy as np
df['status'] = np.where(df['color']=='green', 'done', 'not done')
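Note that the line above writes 'not done' into every non-green row, which is fine for the example but would overwrite any other status. A possible variant, sketched here, passes the existing column as the fallback instead. The 'in progress' value is made up for illustration and does not appear in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'in progress' is not part of the original question
df = pd.DataFrame({
    'color': ['red', 'red', 'green'],
    'status': ['in progress', 'not done', 'not done'],
})

# Use 'done' where the color is green, otherwise keep the current status
df['status'] = np.where(df['color'] == 'green', 'done', df['status'])
print(df)
```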
Upvotes: 1