Reputation: 559
Working with currency data in Python Pandas/Numpy, I am experiencing "unexpected" rounding behavior due to the float representation of decimals. For example
import numpy as np
np.round(19.95 * 0.9, 2)
# 17.949999999999999
Calculating in decimals as I learned in school, the result is 17.955, so I would expect it to round to 17.96.
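To make the float representation issue concrete, here is a small sketch (the variable names are only for illustration). Building a Decimal from the float exposes the binary approximation that is actually stored, while building it from strings gives the school arithmetic:

```python
from decimal import Decimal

stored = Decimal(19.95)      # exact value of the binary float 19.95
written = Decimal("19.95")   # the decimal number as typed

print(stored)                # a long expansion close to, but not equal to, 19.95
print(stored == written)     # False

# Multiplying string-constructed Decimals is exact.
exact_product = Decimal("19.95") * Decimal("0.9")
print(exact_product)         # 17.955
```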
How can I represent numbers in Python Pandas to achieve "decimal style" rounding? I have seen the Decimal module and the currency package, but I'm not sure how to efficiently use them with Pandas data frames.
SOLUTION:
In case anybody else is interested, this is how I solved the problem:
from decimal import Decimal
import pandas as pd
def float2dec(value):
    "Convert `value` to Decimal cents."
    return Decimal(value).quantize(Decimal('0.01'))
df = pd.DataFrame({'price': [1.90, 1.95, 19.95]})
df['price_reduced'] = (df['price'].map(float2dec) * float2dec(0.9)).map(float2dec)
#    price price_reduced
# 0   1.90          1.71
# 1   1.95          1.76
# 2  19.95         17.96
Also, I ended up using the cdecimal implementation, which uses the same syntax but is some 10x faster.
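One caveat with the snippet above: `quantize` uses the context rounding mode, which is ROUND_HALF_EVEN by default, and `Decimal(value)` on a float captures the exact binary value. Below is a minimal sketch of a variant that rounds half up and converts through `str()`. The names `to_cents` and `CENT` and the extra 0.25 row are my own additions for illustration, not part of the original solution:

```python
from decimal import Decimal, ROUND_HALF_UP

import pandas as pd

CENT = Decimal("0.01")  # hypothetical helper constant: two decimal places


def to_cents(value):
    """Round `value` half up to two decimal places.

    str() gives the short repr of a float (e.g. '19.95') instead of
    its exact binary expansion; Decimals pass through unchanged.
    """
    return Decimal(str(value)).quantize(CENT, rounding=ROUND_HALF_UP)


df = pd.DataFrame({"price": [1.90, 1.95, 19.95, 0.25]})
prices = df["price"].map(to_cents)

# Decimal * Decimal is exact; round once at the end.
df["price_reduced"] = (prices * Decimal("0.9")).map(to_cents)
print(df)
```

The 0.25 row is where the rounding mode matters: 0.25 * 0.9 is exactly 0.225, which rounds to 0.23 half up but to 0.22 half even.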
Upvotes: 4
Views: 2836
Reputation: 544
You can use the Decimal type in pandas dataframes. Everything should work as expected, since the Decimal type implements its own versions of addition and multiplication, which pandas will happily use behind the scenes for sum, cumsum, etc.
It won't be as fast as floats, but exact decimal arithmetic never is.
I wouldn't worry about efficiency unless you actually run into performance problems in your program.
http://rhodesmill.org/brandon/2014/pandas-payroll/
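As a rough illustration of the point about sum and cumsum, here is a small sketch with made-up amounts. The column ends up with object dtype, and I'm assuming a reasonably recent pandas:

```python
from decimal import Decimal

import pandas as pd

# Made-up amounts, constructed from strings so they are exact.
amounts = pd.Series([Decimal("0.10"), Decimal("0.20"), Decimal("0.30")])

total = amounts.sum()        # uses Decimal addition, so the result is a Decimal
running = amounts.cumsum()   # running totals, still Decimals

print(amounts.dtype)         # object
print(total)                 # 0.60
print(list(running))
```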
Upvotes: 4