user3058703

Reputation: 591

Performing function in new column based on condition in other column

I have a dataframe df with a column that holds the string values 'Yes' or 'No'. For rows where that column is 'Yes', I want to fill a new column with a random letter drawn according to a probability distribution.

if (df['some_bool'] == 'Yes'):
    df['score'] = np.random.choice(['A', 'B', 'C'], len(df), p=[0.3, 0.2, 0.5])

What is the correct way to write this? The code above raises the following error:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Thanks!

Upvotes: 0

Views: 35

Answers (1)

ParalysisByAnalysis

Reputation: 733

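The comparison df['some_bool'] == 'Yes' returns one boolean per row, and `if` cannot reduce that to a single True or False. Use np.where to choose a value row by row instead: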

df['score'] = np.where(df['some_bool'] == 'Yes', 
                       np.random.choice(['A', 'B', 'C'], len(df), p=[0.3, 0.2, 0.5]), '')
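
An alternative, if you would rather draw letters only for the matching rows, is to build a boolean mask and assign through df.loc. This is a minimal sketch, not the only way to do it: the sample data, the seed, and the use of np.random.default_rng are my assumptions; only the column names and probabilities come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({'some_bool': ['Yes', 'No', 'Yes', 'No', 'Yes']})

# Seeded only so the sketch is repeatable; drop the seed for fresh draws.
rng = np.random.default_rng(0)

# Boolean Series with one True/False entry per row.
mask = df['some_bool'] == 'Yes'

# Default value for rows that do not meet the condition.
df['score'] = ''

# Draw exactly as many letters as there are matching rows and
# write them only into those rows.
df.loc[mask, 'score'] = rng.choice(['A', 'B', 'C'], size=mask.sum(), p=[0.3, 0.2, 0.5])

print(df)
```

The np.where version draws a letter for every row and then discards the ones it does not need, which is fine for small frames. The mask version draws only mask.sum() letters, so it may be preferable when the frame is large or the draws are expensive.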

Upvotes: 1
