Curiosity
Curiosity

Reputation: 169

Read excel data into pandas dataframe to 20 decimal places

I am trying to read data from excel using pandas.ExcelFile() into a dataframe, and then outputting a text file using to_csv.

This output text file should be at precision of 20 decimal places and rounded to 20 decimal places.

If an input value in excel is .005, my code converts it to 0.0050000000000000001 which is incorrect because of the very last digit. It should be 0.0050000000000000000

Here's my code:

dataFrame = pd.ExcelFile("xlFile.xlsx")
pd.set_option("display.precision", 20)
dataFrame.ColumnToRound = dataFrame.ColumnToRound.round(20)
dataFrame.to_csv("out.txt", index=False, sep='\t', float_format='%.20f')

The INPUT excel file would look like this (just one column and one value for simplicity):

ColumnToRound

.005

I have tried using the kwarg float_precision='round_trip' when reading the excel file using pandas.ExcelFile(), but it did not make a difference.

Side note: yes, I know 20 decimal places is a lot and probably more than necessary, but it was not my decision. It has to be 20 decimal places.

Upvotes: 1

Views: 5039

Answers (1)

JohanL
JohanL

Reputation: 6891

Actually, 0.0050000000000000001 is a valid representation of the number. Excel only guarantees precision to 15 significant digits (according to the IEEE 754 specification).

That means, what happens after that is anyone's guess (and choice). Internally Excel represents the number as a double precision float (64 bits), which would actually be 0.005000000000000000104083408558608425664715468883514404296875 but when showing the number a trick is employed, cutting the number after 15 significant digits and then pad with zeros if the formatting of the cell requests more decimals.

Therefore, when showing the number 0.005 with up to 30 decimals in Excel, you will see 0.005000000000000000000000000000 which looks like a better representation of the number, but this is not actually the case, which becomes apparent if further arithmetic operations (including the least significant digits) are performed on the number.

Upvotes: 1

Related Questions