RobZ

Reputation: 516

How to find and replace values in a DataFrame column based on a condition in another column?

My goal is to find and replace values in one column based on a condition on another column, so that each replacement depends on the values in that row.

For example:

import pandas
data = [['red', 'not done'], ['red', 'not done'], ['green', 'not done']]
df = pandas.DataFrame(data, columns = ['color', 'status']) 
print(df)

We have our output DataFrame:

   color    status
0    red  not done
1    red  not done
2  green  not done

I want to change the status of every green row to done, like this:

   color    status
0    red  not done
1    red  not done
2  green      done

What I have tried:

df['status'] = df['status'].replace(to_replace = [df['color'] == 'green'], value = 'done')

But this does not change anything.

I also tried:

df['status'] = df.where(cond = [df['color'] == 'green'] , other = 'done')

but this raises an error that I don't understand:

ValueError: Array conditional must be same shape as self

How can I do this replacement correctly?

Upvotes: 1

Views: 617

Answers (3)

Quang Hoang

Reputation: 150805

Your second attempt can be fixed with a small change:

This line

df['status'] = df.where(cond = [df['color'] == 'green'] , other = 'done')

should be either

df['status'] = df['status'].mask(df['color'] == 'green' , 'done')

or:

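df['status'] = df['status'].where(df['color'] != 'green' , 'done')

where keeps the values where the condition is True and replaces the others, so the condition is negated here.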
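To check that both corrected forms agree, here is a minimal sketch on the data from the question. As far as I can tell, the original call fails for two reasons: wrapping the condition in a list turns it into an array of shape (1, 3), which does not match the DataFrame's shape (3, 2), and calling where on the whole DataFrame would return a DataFrame rather than a single column. The names is_green, masked and kept are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [["red", "not done"], ["red", "not done"], ["green", "not done"]],
    columns=["color", "status"],
)

# Boolean Series aligned with df's index (not wrapped in a list).
is_green = df["color"] == "green"

# mask: replace the values where the condition is True.
masked = df["status"].mask(is_green, "done")

# where: keep the values where the condition is True, replace the rest,
# so the condition has to be negated.
kept = df["status"].where(~is_green, "done")

print(masked.tolist())
print(kept.tolist())
```

Both calls return a new Series, so the result still has to be assigned back to df['status'].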

Upvotes: 1

SchwarzeHuhn

Reputation: 648

import numpy as np

df['status'] = np.where(df['color']=='green', 'done', 'not done')
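One caveat: the line above writes 'not done' into every row that is not green, which is fine for the sample data but would overwrite any other existing status. A hedged variant is to pass the current column as the fallback instead. In the sketch below, the 'in progress' value is a hypothetical addition to the question's data, included only to show that it survives.

```python
import numpy as np
import pandas as pd

# 'in progress' is a made-up status, not part of the original question.
df = pd.DataFrame(
    [["red", "in progress"], ["red", "not done"], ["green", "not done"]],
    columns=["color", "status"],
)

# Where the color is green use 'done', otherwise keep the existing status.
df["status"] = np.where(df["color"] == "green", "done", df["status"])

print(df)
```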

Upvotes: 1

gold_cy

Reputation: 14236

A simple way to do a mass update is to use df.loc:

df.loc[df.color == 'green', 'status'] = 'done'

   color    status
0    red  not done
1    red  not done
2  green      done
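For completeness, here is a self-contained version of the same approach on the question's data. The point of df.loc is that the row condition and the column label go into a single indexing call, so the assignment is applied to df itself. A chained form such as df[df.color == 'green']['status'] = 'done' may assign to a temporary copy and leave df unchanged. The name is_green is only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [["red", "not done"], ["red", "not done"], ["green", "not done"]],
    columns=["color", "status"],
)

# Boolean Series: True for the rows that should be updated.
is_green = df["color"] == "green"

# Row selector first, column label second; df is modified in place.
df.loc[is_green, "status"] = "done"

print(df)
```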

Upvotes: 2
