Mark
Mark

Reputation: 177

Pandas Subtract Two Columns Not Working Correctly

Using the code below, the results should show 0 for Dollar_Different. It works on all other rows except for the 3 below. Any ideas why? I've manually checked in Excel and my result is 0.

df['Dollar_Different'] = df['R_Revenue'] - df['S_Revenue']

Current Results - Wrong

Upvotes: 2

Views: 1710

Answers (3)

Mark
Mark

Reputation: 177

I changed the source data types to integers and everything matches now.

Upvotes: 0

Dave Costa
Dave Costa

Reputation: 48121

Presumably you are using floating-point types (which is the default). Floating point cannot represent every real number exactly. Probably, the numbers in the two columns are actually slightly different, but you are displaying them with only 2 decimal digits.

In applications that deal with financial figures, floating-point types are avoided, for exactly the reason you see here. There are other datatypes designed for properly representing decimal quantities with accuracy. But IIRC pandas does not support any of these.

A simple workaround that you could use would be to represent the revenue quantities as integer amounts: either a single integer field representing the total quantity in cents, or two fields, one for the dollars component and another for the cents component.

Upvotes: 1

drakedog
drakedog

Reputation: 351

Are you sure that R_Revenue and S_Revenue are exactly these values and are not truncated to 2 decimals? You can try first to round using https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.round.html and then subtract again, see if that yields 0.

Upvotes: 0

Related Questions