How to Fix Floating Point Discrepancies in Python Pandas Dataframes?

Question

I'm reading a CSV file into a pandas DataFrame. When I retrieve the data, I get slightly different values than the original data.

I believe it has something to do with the way Python represents decimals. But how do I fix it/work around it?

CSV data example:

1313331280,10.4,0.779
1313334917,10.4,0.316
1313334917,10.4,0.101
1313340309,10.5,0.15
1313340309,10.5,1.8

Pandas dataframe:

df = pd.read_csv(csv_file_full_path, names=['time','price', 'volume'])

The output:

df.iloc[0]['volume']

source file value = 0.779
the pandas output value = 0.77900000000000003

The data is getting changed when it is read into the pandas DataFrame. What's the fix?

Sreyantha Chary · Accepted Answer

The discrepancy comes from binary floating-point representation: 0.779 has no exact binary representation, so the nearest double is stored, and printing it with 17 significant digits shows 0.77900000000000003. If you know the maximum number of decimal places in your column, you can use round(float_number, number_of_decimals) to get back the expected values. Alternatively, you can read the column as a string and then convert it to float with float(float_number_string).
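A minimal sketch of both suggestions, assuming the sample rows from the question (an in-memory StringIO stands in for csv_file_full_path, and the variable names are illustrative). Note that float() on the string form produces the same double as the default parse, so Decimal is swapped in here, as an assumption, for the case where the exact decimal text matters:

```python
import io
from decimal import Decimal

import pandas as pd

# Sample rows copied from the question; StringIO stands in for the CSV file.
csv_text = """1313331280,10.4,0.779
1313334917,10.4,0.316
1313334917,10.4,0.101
1313340309,10.5,0.15
1313340309,10.5,1.8
"""
names = ["time", "price", "volume"]

# Default parse: the volume column becomes float64.
df = pd.read_csv(io.StringIO(csv_text), names=names)
volume = float(df["volume"].iloc[0])

# The short and the 17-digit spellings denote the same stored double,
# so nothing was changed on read; only the printed precision differs.
same_double = (0.779 == 0.77900000000000003)
seventeen_digits = format(0.779, ".17f")

# Suggestion 1: round to the known number of decimal places.
rounded_value = round(volume, 3)
rounded_column = df["volume"].round(3)

# Suggestion 2: read the column as text, then convert as needed.
df_text = pd.read_csv(io.StringIO(csv_text), names=names, dtype={"volume": str})
raw_text = df_text["volume"].iloc[0]   # the characters from the file
exact_value = Decimal(raw_text)        # exact decimal, no binary rounding

print(same_double, seventeen_digits, rounded_value, raw_text, exact_value)
```

If the goal is only a tidy display, formatting the float (for example format(volume, ".3f")) is enough. A column of Decimal objects is stored with object dtype and is slower than float64, so it is worth using only when exact decimal arithmetic is actually required.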
