Reputation: 558
I have a pandas DataFrame df with multiple columns. I want to add a new column based on the values of other columns. I found many answers on Stack Overflow that use np.where and np.select. However, in my case, for every condition (every if/elif/else block), the new column has to choose among 3 values with a specific ratio. For example:
l = ['a', 'b', 'c']
for i in range(df.shape[0]):
    if df.iloc[i]['col1'] == x:
        # choose one value from l with probabilities 0.3, 0.3, 0.4
        df.loc[df.index[i], 'new_col'] = np.random.choice(l, p=[0.3, 0.3, 0.4])
That is, for all rows satisfying the condition in the if statement, the elements of list l should be distributed across the new column in the above-mentioned ratio.
One approach is to split df into multiple sub data frames df_sub, one for each if/else conditional statement. Next, create a list using np.random.choice(l, df_sub.shape[0], p=[0.3, 0.3, 0.4]), where l=['a','b','c']. Add that list to df_sub as a new column, and then join all the sub data frames along axis=0.
Upvotes: 0
Views: 371
Reputation: 150785
Try:
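# Boolean mask of the rows that satisfy the condition
s = df['col1'] == x
# Draw one value per matching row with probabilities 0.3, 0.3, 0.4
df.loc[s, 'new_col'] = np.random.choice(['a', 'b', 'c'],
                                        size=s.sum(),
                                        p=[0.3, 0.3, 0.4])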
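If there are several conditions, each with its own values and ratio, the same mask-and-assign idea can be looped over a small lookup table. The following is only a sketch: the sample data, the rules dict, the 'other' default and the seed are all made up for illustration and are not from the question. Note that p gives probabilities, so the observed proportions will be close to the ratio but not exactly equal to it.

```python
import numpy as np
import pandas as pd

# Hypothetical data, for illustration only
rng = np.random.default_rng(0)
df = pd.DataFrame({'col1': rng.choice(['x', 'y', 'z'], size=1000)})

# Hypothetical rules: value of col1 -> (candidate values, probabilities)
rules = {
    'x': (['a', 'b', 'c'], [0.3, 0.3, 0.4]),
    'y': (['d', 'e', 'f'], [0.5, 0.25, 0.25]),
}

# Default for rows that match no rule (the "else" branch)
df['new_col'] = 'other'

for value, (choices, probs) in rules.items():
    mask = df['col1'] == value
    # One random draw per matching row, weighted by probs
    df.loc[mask, 'new_col'] = rng.choice(choices, size=mask.sum(), p=probs)

# Observed proportions per condition
print(df.groupby('col1')['new_col'].value_counts(normalize=True))
```

This avoids splitting df into sub data frames and concatenating them again, since each assignment only touches the rows selected by its mask.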
Upvotes: 1