CandleWax

Reputation: 2219

How to perform a calculation with conditionals in pandas?

I want to compute the average cost per visit in a new column. However, if the visit count is 0, I want to divide by 1 instead so that VISIT_COST is not infinity.

My approach (the assignment shown after the sample data) raises this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). 

df

id    Cost          Visit
2     52            1
3     85            0
4     853           10

df['VISIT_COST'] = [df['Cost']/df['Visit'] if df['Visit'] != 0 else df['Cost']/1]

final df:

id    Cost          Visit    VISIT_COST
2     52            1        52
3     85            0        85
4     853           10       85.3

Upvotes: 2

Views: 1825

Answers (2)

jezrael

Reputation: 862781

I think the best option here is numpy.where, because it is vectorized and very fast:

df['VISIT_COST'] = np.where(df['Visit'] != 0, df['Cost']/df['Visit'], df['Cost'])
print (df)
   id  Cost  Visit  VISIT_COST
0   2    52      1        52.0
1   3    85      0        85.0
2   4   853     10        85.3
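
For reference, here is a self-contained sketch of the numpy.where approach, with the sample frame rebuilt from the question. One thing worth knowing: np.where receives both branches already computed, so Cost/Visit is still evaluated for the zero-visit row (pandas yields inf there) and that value is simply not selected. The variable name ratio is only for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({'id': [2, 3, 4],
                   'Cost': [52, 85, 853],
                   'Visit': [1, 0, 10]})

# Element-wise division; the zero-visit row becomes inf here
ratio = df['Cost'] / df['Visit']

# Pick ratio where Visit is non-zero, otherwise fall back to Cost (i.e. Cost / 1)
df['VISIT_COST'] = np.where(df['Visit'] != 0, ratio, df['Cost'])
print(df)
```

The condition is a whole boolean Series evaluated row by row, which is why this works where a plain Python if/else on a Series raises the "truth value of a Series is ambiguous" error.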

An out-of-the-box alternative: to replace 0 with 1, add the boolean mask (converted to integers) to the Visit column and divide by the result:

df['VISIT_COST'] = df['Cost'].div(df['Visit'].eq(0).astype(int).add(df['Visit']))
print (df)
   id  Cost  Visit  VISIT_COST
0   2    52      1        52.0
1   3    85      0        85.0
2   4   853     10        85.3

Detail:

print (df['Visit'].eq(0).astype(int))
0    0
1    1
2    0
Name: Visit, dtype: int32

print (df['Visit'].eq(0).astype(int).add(df['Visit']))
0     1
1     1
2    10
Name: Visit, dtype: int64
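
If the mask arithmetic feels too clever, a possibly more readable variant is Series.replace, which turns zero counts into 1 before dividing. This is a swapped-in technique, not part of the solution above. A minimal sketch, assuming the same sample data as the question (the names denominator and denominator_alt are only for illustration):

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({'id': [2, 3, 4],
                   'Cost': [52, 85, 853],
                   'Visit': [1, 0, 10]})

# Mask trick from above: True (zero visits) becomes 1 and is added to the count
denominator = df['Visit'].eq(0).astype(int).add(df['Visit'])

# Alternative: replace zero counts with 1 directly
denominator_alt = df['Visit'].replace(0, 1)

df['VISIT_COST'] = df['Cost'].div(denominator_alt)
print(df)
```

Both denominators should come out identical for non-negative visit counts, so the choice is mostly about readability.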

Upvotes: 4

BENY

Reputation: 323306

I think we should use mask:

#df['VISIT_COST'] = np.where(df['Visit'] != 0,df['Cost']/df['Visit'] , df['Cost']/1)


df['VISIT_COST'] = df['Cost'].mask(df['Visit'] != 0, df['Cost'] / df['Visit'])
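
To make the mask semantics concrete: Series.mask replaces values where the condition is True, while Series.where keeps them. A small sketch, assuming the sample data from the question (ratio and via_where are illustrative names), showing that the two spellings should agree:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({'id': [2, 3, 4],
                   'Cost': [52, 85, 853],
                   'Visit': [1, 0, 10]})

ratio = df['Cost'] / df['Visit']

# mask: start from Cost, overwrite with ratio wherever Visit is non-zero
df['VISIT_COST'] = df['Cost'].mask(df['Visit'] != 0, ratio)

# where: start from ratio, keep it wherever Visit is non-zero, else use Cost
via_where = ratio.where(df['Visit'] != 0, df['Cost'])
print(df)
```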

Upvotes: 2
