Reputation: 591
I am trying to filter on a column in DataFrame df that holds the string values 'Yes' or 'No', so that I can assign a random letter, drawn from a given probability distribution, to the rows where the condition is met.
if (df['some_bool'] == 'Yes'):
    df['score'] = np.random.choice(['A', 'B', 'C'], len(df), p=[0.3, 0.2, 0.5])
What is the correct way to write this? The code above raises the following error:
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Thanks!
Upvotes: 0
Views: 35
Reputation: 733
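An if statement needs a single True/False value, but df['some_bool'] == 'Yes' produces one boolean per row, which is why pandas calls the truth value ambiguous. Use np.where to choose a value row by row instead: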
df['score'] = np.where(df['some_bool'] == 'Yes',
                       np.random.choice(['A', 'B', 'C'], len(df), p=[0.3, 0.2, 0.5]),
                       '')
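Note that the np.where version draws a letter for every row and then discards the ones where some_bool is not 'Yes'. If you would rather draw only for the matching rows, a boolean mask with .loc is one alternative. This is a minimal sketch, not the only way to do it: the sample DataFrame, the seeded generator rng, and the empty-string default are assumptions added for illustration; only the column names and probabilities come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({"some_bool": ["Yes", "No", "Yes", "No", "Yes"]})

# Seeded only so the sketch is repeatable; drop the seed for real use.
rng = np.random.default_rng(0)

# Boolean Series with one True/False per row.
mask = df["some_bool"] == "Yes"

# Default value for rows that do not match (assumed to be an empty string).
df["score"] = ""

# Draw exactly as many letters as there are matching rows and assign them.
df.loc[mask, "score"] = rng.choice(
    ["A", "B", "C"], size=int(mask.sum()), p=[0.3, 0.2, 0.5]
)

print(df)
```

Either approach avoids the if statement entirely, since the selection happens per row rather than on the column as a whole.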
Upvotes: 1