Reputation: 691
I am using a pandas DataFrame in Python that has a column of percentages. I would like to replace values that are greater than 50% with 'Likely' and values below that with 'Not-Likely'.
Here are the options I found:
df.apply
df.iterrows
df.where
This works using df.iterrows:
for index, row in df.iterrows():
    if row['Chance'] > 0.50:
        df.loc[index, 'Chance'] = 'Likely'
    else:
        df.loc[index, 'Chance'] = 'Not-Likely'
However, I have read that this is not an optimal way of 'updating' values.
How would you do this using the other methods and which one would you recommend? Also, if you know any other methods, please share! Thanks
Upvotes: 0
Views: 64
Reputation: 28640
Give this a shot.
import numpy as np
df['Chance'] = np.where(df['Chance'] > 0.50, 'Likely', 'Not-Likely')
Note, however, that this labels any value exactly equal to 0.50 as 'Not-Likely'; use >= instead of > if 0.50 should count as 'Likely'.
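To make the boundary behavior concrete, here is a small sketch on made-up data (only the 'Chance' column name comes from the question; the sample values are an assumption). It also shows what the df.apply route from the question might look like, which should give the same labels but calls a Python function once per element.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the 'Chance' column name comes from the question.
df = pd.DataFrame({"Chance": [0.10, 0.50, 0.51, 0.90]})

# Strict comparison: a value of exactly 0.50 falls into the 'Not-Likely' branch.
strict = np.where(df["Chance"] > 0.50, "Likely", "Not-Likely")

# Inclusive comparison: a value of exactly 0.50 is labeled 'Likely' instead.
inclusive = np.where(df["Chance"] >= 0.50, "Likely", "Not-Likely")

# Same labeling via Series.apply, which evaluates the lambda once per element.
applied = df["Chance"].apply(lambda p: "Likely" if p > 0.50 else "Not-Likely")

print(list(strict))
print(list(inclusive))
print(list(applied))
```

Only the second row (0.50) differs between the strict and inclusive versions, so the choice of operator matters only if your data can hit the threshold exactly.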
Just as a side note, .itertuples() is said to be about 10x faster than .iterrows(), and zip about 100x faster.
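If you do need an explicit loop, a rough sketch of the itertuples and zip variants could look like this. The sample values and the 'Label' column name are assumptions made for illustration; the labels are collected in a list and assigned once rather than written back row by row.

```python
import pandas as pd

# Hypothetical sample data; only the 'Chance' column name comes from the question.
df = pd.DataFrame({"Chance": [0.10, 0.50, 0.51, 0.90]})

# itertuples yields one namedtuple per row; columns are read as attributes.
labels_itertuples = [
    "Likely" if row.Chance > 0.50 else "Not-Likely"
    for row in df.itertuples(index=False)
]

# zip iterates over the column values directly; with several columns you
# would pass each one, e.g. zip(df["a"], df["b"]).
labels_zip = [
    "Likely" if chance > 0.50 else "Not-Likely"
    for (chance,) in zip(df["Chance"])
]

# Assign the result in one step ('Label' is a hypothetical column name).
df["Label"] = labels_zip
print(df)
```

Both loops produce the same labels here; for a simple threshold like this one, the vectorized np.where call is still the shorter option.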
Upvotes: 3