Tanner Clark

Reputation: 691

Replacing Values in Dataframe Based on Condition

I am using a DataFrame in Python that has a column of percentages. I would like to replace values greater than 50% with 'Likely' and values less than 50% with 'Not-Likely'.

Here are the options I found:

df.apply
df.iterrows
df.where

This works using df.iterrows:

for index, row in df.iterrows():
    if row['Chance'] > 0.50:
        df.loc[index, 'Chance'] = 'Likely'
    else:
        df.loc[index, 'Chance'] = 'Not-Likely'

However, I have read that this is not an optimal way of 'updating' values.

How would you do this using the other methods and which one would you recommend? Also, if you know any other methods, please share! Thanks

Upvotes: 0

Views: 64

Answers (1)

chitown88

Reputation: 28640

Give this a shot.

import numpy as np

df['Chance'] = np.where(df['Chance'] > 0.50, 'Likely', 'Not-Likely')

Note, however, that this labels a value exactly equal to 0.50 as 'Not-Likely'.
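If exactly 0.50 should count as 'Likely', switching to >= is one way to handle it. The sketch below shows that variant, plus rough equivalents for the apply and where options mentioned in the question. The sample values are made up for illustration; only the 'Chance' column name comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the 'Chance' name comes from the question.
chance = pd.Series([0.10, 0.50, 0.75], name='Chance')

# np.where with >= so that exactly 0.50 counts as 'Likely'.
inclusive = np.where(chance >= 0.50, 'Likely', 'Not-Likely')

# Series.apply: calls a Python function per element (simple, not vectorized).
via_apply = chance.apply(lambda p: 'Likely' if p > 0.50 else 'Not-Likely')

# Series.where: start from an all-'Likely' Series and replace the entries
# where the condition is False with 'Not-Likely'.
via_where = pd.Series('Likely', index=chance.index).where(chance > 0.50, 'Not-Likely')

print(list(inclusive))
print(via_apply.tolist())
print(via_where.tolist())
```

Of these, np.where is probably the most direct fit for a two-way label, since it takes the condition and both outcomes in one call.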

Just as a side note, .itertuples() is said to be about 10x faster than .iterrows(), and zip about 100x faster.
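For reference, here is a minimal sketch of what looping with itertuples and zip could look like if you do need a row-by-row loop. The sample data and the 'Label' column are assumptions for illustration, and no timings are measured here.

```python
import pandas as pd

# Hypothetical sample data; only the 'Chance' name comes from the question.
df = pd.DataFrame({'Chance': [0.10, 0.50, 0.75]})

# itertuples yields one namedtuple per row; columns are read as attributes.
labels_tuples = [
    'Likely' if row.Chance > 0.50 else 'Not-Likely'
    for row in df.itertuples(index=False)
]

# zip iterates over the index and the column values directly.
labels_zip = [
    'Likely' if chance > 0.50 else 'Not-Likely'
    for _, chance in zip(df.index, df['Chance'])
]

# Assign the collected labels in one step rather than writing row by row.
# 'Label' is a hypothetical new column, so the original numbers are kept.
df['Label'] = labels_zip
print(df)
```

Both loops build the full list first and assign it once, which avoids the repeated df.loc writes in the iterrows version.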

Upvotes: 3
