Reputation: 2219
I am trying to write a conditional that computes the average cost per visit in a new column. However, if the visit count is 0, I want to divide by 1 instead, so that VISIT_COST is not infinity.
My approach throws an error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
df
id Cost Visit
2 52 1
3 85 0
4 853 10
df['VISIT_COST'] = [df['Cost']/df['Visit'] if df['Visit'] != 0 else df['Cost']/1]
final df:
id Cost Visit VISIT_COST
2 52 1 52
3 85 0 85
4 853 10 85.3
Upvotes: 2
Views: 1825
Reputation: 862781
I think the best option here is numpy.where, because it is very fast:
import numpy as np
df['VISIT_COST'] = np.where(df['Visit'] != 0, df['Cost']/df['Visit'], df['Cost'])
print (df)
id Cost Visit VISIT_COST
0 2 52 1 52.0
1 3 85 0 85.0
2 4 853 10 85.3
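For reference, here is a self-contained sketch of the above, assuming the sample frame from the question (the variable visit_cost is only there for illustration). The original list comprehension fails because df['Visit'] != 0 is a whole boolean Series, which a plain if cannot reduce to a single True/False; np.where applies the condition element-wise instead.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({'id': [2, 3, 4], 'Cost': [52, 85, 853], 'Visit': [1, 0, 10]})

# Element-wise choice: Cost / Visit where Visit is non-zero, otherwise Cost.
# Both branches are computed for every row, so the zero-visit row yields inf
# in the division result, but that value is never selected.
df['VISIT_COST'] = np.where(df['Visit'] != 0, df['Cost'] / df['Visit'], df['Cost'])

visit_cost = df['VISIT_COST'].tolist()  # illustrative helper variable
print(df)
```

Because both branches are evaluated up front, this relies on pandas returning inf for the zero divisor rather than raising.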
An out-of-the-box alternative: to replace 0 with 1 in the divisor, add the boolean mask converted to integers:
df['VISIT_COST'] = df['Cost'].div(df['Visit'].eq(0).astype(int).add(df['Visit']))
print (df)
id Cost Visit VISIT_COST
0 2 52 1 52.0
1 3 85 0 85.0
2 4 853 10 85.3
Detail:
print (df['Visit'].eq(0).astype(int))
0 0
1 1
2 0
Name: Visit, dtype: int32
print (df['Visit'].eq(0).astype(int).add(df['Visit']))
0 1
1 1
2 10
Name: Visit, dtype: int64
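The same idea, swapping the zero divisors for 1 before dividing, can probably also be written with Series.replace. This is a sketch of an alternative spelling, not part of the approach above, and it assumes Visit holds plain integer counts:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({'id': [2, 3, 4], 'Cost': [52, 85, 853], 'Visit': [1, 0, 10]})

# Divisor built as in the answer: the boolean mask (1 where Visit == 0) plus Visit
divisor_mask = df['Visit'].eq(0).astype(int).add(df['Visit'])

# Assumed-equivalent alternative: replace the zeros with 1 directly
divisor_replace = df['Visit'].replace(0, 1)

df['VISIT_COST'] = df['Cost'].div(divisor_replace)
print(df)
```

Either way the division never sees a zero, so no inf is produced at any point.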
Upvotes: 4
Reputation: 323306
I think we should use mask:
#df['VISIT_COST'] = np.where(df['Visit'] != 0,df['Cost']/df['Visit'] , df['Cost']/1)
df['VISIT_COST'] = df['Cost'].mask(df['Visit'] != 0, df['Cost']/df['Visit'])
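A runnable sketch of the mask approach, again assuming the sample frame from the question. Series.mask replaces values where the condition is True, while Series.where keeps values where the condition is True, so the second form below is just the mirrored spelling (the names via_mask and via_where are illustrative only):

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({'id': [2, 3, 4], 'Cost': [52, 85, 853], 'Visit': [1, 0, 10]})

ratio = df['Cost'] / df['Visit']  # contains inf for the zero-visit row

# mask: start from Cost, overwrite with the ratio where Visit != 0
via_mask = df['Cost'].mask(df['Visit'] != 0, ratio)

# where: start from the ratio, keep it where Visit != 0, else fall back to Cost
via_where = ratio.where(df['Visit'] != 0, df['Cost'])

df['VISIT_COST'] = via_mask
print(df)
```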
Upvotes: 2