Utpal Dutt
Utpal Dutt

Reputation: 403

reading a large number from excel with pandas

I am reading a xlsx file with pandas and a Column contain 18 digit number for example 360000036011012000

after reading the number is converted to 360000036011011968

my code

import pandas as pd
df = pd.read_excel("Book1.xlsx")

I also tried converting the column to string but the results are same

df = pd.read_excel("Book1.xlsx",dtype = {"column_name":"str" })

also tried with engine = 'openpyxl'

also if the same number is in csv file there is no problem reading works fine but I have to read it from excel only.

Upvotes: 6

Views: 1567

Answers (1)

Andreas
Andreas

Reputation: 9197

That is an Excel problem, not a pandas problem. See here:

enter image description here

The yellow marked entries, are actually the number below * 10 +1 so should not end on 0. enter image description here

What happens under the hood in Excel seems to be a number limit of 18. But the last two numbers are interpreted as decimals. Since this is a Excel not a CSV problem, a csv will work just fine.

Solution:

Format the numbers in Excel as Text, as shown in the first picture with: =Text(CELL,0). Pandas can then import it as string, but you will lose the information of the last digits. Therefore Excel should not be used for numbers with more than 18 digits. Use a different file, like csv, insert the numbers directly as strings into excel by using a leading: ' symbol.

Upvotes: 1

Related Questions