johng

Reputation: 21

Pandas dataframe division returning 'inf' when dividing by non-zero

When I create a new column in my pandas dataframe by dividing one existing column by another, I get 'inf' in rows where there is no division by zero.

claims_report['% COST DIFFERENCE'] = 100*claims_report['COST DIFFERENCE']/claims_data['ORIGINAL UNIT COST']
print(claims_report[['ORIGINAL UNIT COST','COST DIFFERENCE','% COST DIFFERENCE']].head(9))

The result of the above code is:

   ORIGINAL UNIT COST  COST DIFFERENCE  % COST DIFFERENCE
0              4.3732          11.2500         257.248697
1              3.7935          22.0000         579.939370
2              6.9167          22.0000         318.070756
3              1.1429           4.5000         393.735235
4              0.0000           7.3269                inf
5              7.3269          -0.8622         -11.767596
6              6.4647           0.7853          12.147509
7              0.2590           0.0170           6.563707
8             14.4471         -12.7145               -inf

By my calculations, there should not be a -inf in row 8. As a check I ran the following code:

for i in range(9):
    print(i, claims_report['COST DIFFERENCE'][i], claims_report['ORIGINAL UNIT COST'][i], claims_report['COST DIFFERENCE'][i]/claims_report['ORIGINAL UNIT COST'][i])

Which gives me the expected result in row 8:

0 11.25 4.3732 2.5724869660660388 
1 22.0 3.7935 5.799393699749571 
2 22.0 6.9167 3.180707562855119 
3 4.5 1.1429 3.937352349286902 
4 7.3269 0.0 inf 
5 -0.8622 7.3269 -0.11767596118412971 
6 0.7853 6.4647 0.1214750877844293 
7 0.017 0.259 0.06563706563706564 
8 -12.7145 14.4471 -0.880072817382035

Is anyone familiar with this type of issue?

Upvotes: 0

Views: 2852

Answers (2)

Arnav

Reputation: 286

Another option is to have pandas treat infinite values as missing:

import pandas as pd
pd.set_option('use_inf_as_na', True)

This makes pandas treat 'inf' and '-inf' as NaN. You can then replace them with the fillna method. Note that fillna with inplace=True returns None, so either assign the result or use inplace=True, not both:

df = df.fillna(value=0)
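
As far as I know, the use_inf_as_na option has been deprecated in newer pandas releases, so a version-independent alternative is to replace the infinities explicitly before filling. This is only a minimal sketch: the frame `df` and its column `pct` are made up for illustration, and filling with 0 hides the underlying problem rather than fixing it.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a percentage column that already contains inf / -inf.
df = pd.DataFrame({"pct": [257.2, np.inf, -np.inf, -88.0]})

# Turn both infinities into NaN, then fill the NaN values with 0.
# replace() and fillna() each return a new frame, so df itself is unchanged.
cleaned = df.replace([np.inf, -np.inf], np.nan).fillna(0)

print(cleaned)
```

Keeping the NaN values instead of filling them with 0 is often safer, since a 0% difference is a real value and can be confused with "could not be computed".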

Upvotes: 0

Victor Lira

Reputation: 101

In your first line

claims_report['% COST DIFFERENCE'] = 100*claims_report['COST DIFFERENCE']/claims_data['ORIGINAL UNIT COST']

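Did you mean "claims_report" instead of "claims_data"? The numerator comes from claims_report but the denominator comes from claims_data, so you may be dividing by a column from the wrong dataframe. Your check loop uses claims_report for both columns, which would explain why it gives the expected result.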
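
To illustrate how that could produce the -inf: pandas aligns the two Series on their index, so the denominator for each row comes from the other frame. The sketch below uses small made-up frames as stand-ins (the real claims_data is not shown in the question, so its zero in the last row is an assumption).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the report frame (three rows taken from the question).
claims_report = pd.DataFrame({
    "ORIGINAL UNIT COST": [4.3732, 0.0, 14.4471],
    "COST DIFFERENCE": [11.25, 7.3269, -12.7145],
})

# Hypothetical second frame: same index, but the last unit cost is assumed to be 0.
claims_data = pd.DataFrame({"ORIGINAL UNIT COST": [4.3732, 0.0, 0.0]})

# Mixed frames: the denominator is taken from claims_data, row by index label.
wrong = 100 * claims_report["COST DIFFERENCE"] / claims_data["ORIGINAL UNIT COST"]

# Same frame for numerator and denominator.
right = 100 * claims_report["COST DIFFERENCE"] / claims_report["ORIGINAL UNIT COST"]

print(pd.DataFrame({"wrong": wrong, "right": right}))
```

In this sketch the last row of `wrong` is -inf because a negative number is divided by the 0 in claims_data, while `right` gives a finite value. The second row is inf in both cases because the unit cost really is 0 there.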

Upvotes: 1
