Reputation: 25999
I have the following dataframe and I want to compare the columns value and predicted. If they match, I want to set the value of a column "provided" to False. I'm having difficulty doing this.
Here's my data:
  ticker  periodDate       value   predicted
0    ibm        2017  150079.080  150079.080
1    ibm        2016   49799.140   49799.140
2    ibm        2015     459.016   45949.016
I want a new column to just have a True/False if value and predicted match. I tried this but to no avail:
def provideOrPredicted(df):
    if df['value'] == df['predicted']:
        df['provided'] = False
    elif df['value'] != df['predicted']:
        df['provided'] = False
    print(df)

provideOrPredicted(MergedDF)
I get this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Upvotes: 5
Views: 7518
Reputation: 33938
Because the result of your == / != comparisons is vectorized: comparing two columns gives a whole Series of booleans, one per row (df['value'] != df['predicted'] is equivalent to df['value'].ne(df['predicted'])).
But the base-Python if statement knows nothing about pandas and numpy, so it can't handle vectors (only scalars like True and False).
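To make this concrete, here is a minimal sketch that rebuilds the sample data from the question and reproduces the error. The variable names mask and error_message are only illustrative and do not come from the original post.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ticker": ["ibm", "ibm", "ibm"],
    "periodDate": [2017, 2016, 2015],
    "value": [150079.080, 49799.140, 459.016],
    "predicted": [150079.080, 49799.140, 45949.016],
})

# The comparison is element-wise: one boolean per row, not a single bool.
mask = df["value"] == df["predicted"]
print(type(mask).__name__)
print(mask.tolist())

# Passing that Series to a plain `if` asks pandas for one truth value,
# which it refuses to guess.
error_message = None
try:
    if mask:
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

If a single yes/no answer is really what you need, mask.all() or mask.any() reduce the Series to one boolean, as the error message suggests.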
So do the (vectorized) assignment directly in pandas, without any if-statement:
df['provided'] = df['value'].ne(df['predicted'])
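As a quick check, here is that assignment applied to the sample data from the question (a sketch, with the dataframe rebuilt by hand). Note that .ne() yields True where the two columns differ, so matching rows end up with provided set to False, as the question asks.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ticker": ["ibm", "ibm", "ibm"],
    "periodDate": [2017, 2016, 2015],
    "value": [150079.080, 49799.140, 459.016],
    "predicted": [150079.080, 49799.140, 45949.016],
})

# Vectorized "not equal": True where value and predicted differ.
df["provided"] = df["value"].ne(df["predicted"])
print(df)
```

Only the 2015 row, where value and predicted differ, gets provided set to True.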
Upvotes: -2
Reputation: 1643
Basically, the line below compares the two columns row by row and assigns the boolean result to the new column provided:
df['provided'] = df['value'] == df['predicted']
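For illustration, here is that line run against the sample data from the question (a sketch; the names same and inverted are illustrative only). With ==, provided is True where the columns match. If you want the opposite convention, negate the result with ~.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ticker": ["ibm", "ibm", "ibm"],
    "periodDate": [2017, 2016, 2015],
    "value": [150079.080, 49799.140, 459.016],
    "predicted": [150079.080, 49799.140, 45949.016],
})

# Element-wise equality: True where value and predicted match.
df["provided"] = df["value"] == df["predicted"]

# .eq() is the method form of the same comparison.
same = df["value"].eq(df["predicted"])

# ~ flips every boolean, giving True where the columns differ.
inverted = ~df["provided"]

print(df)
print(inverted.tolist())
```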
Upvotes: 6