arthur

Reputation: 173

Using astype in Pandas does not give the expected result

I'm trying to convert a float to int in a Pandas dataframe. I usually use .astype('int64') but, in this case, it is not working. This is the code I'm using:

import pandas as pd
d = {'test': [1]}
df = pd.DataFrame(data=d, columns=['test'])

df['test'] = 60590820065001969.0
df['test'].astype('int64')

This is the result I get:

0    60590820065001968
Name: test, dtype: int64

Please notice how those numbers are different (the float ends with 69 and the integer version ends with 68).

If I try a smaller number, by removing the first 2 digits, then it works fine:

df['test'] = 590820065001969.0
df['test'].astype('int64')

Gives me:

0    590820065001969
Name: test, dtype: int64

Which makes me think it might have something to do with the number size, but I'm not sure what it is. Can anyone spot the problem here? By the way, I'm using Python 3.

Upvotes: 4

Views: 167

Answers (1)

bigbounty

Reputation: 17408

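60590820065001969.0 is too large to be represented exactly as a 64-bit float, which is the format used by both Python's float and NumPy's float64. A float64 has a 53-bit significand, so only integers up to 2**53 (9007199254740992) are guaranteed to be exact. Near 6e16, adjacent floats are 8 apart, so the literal is rounded to the nearest representable value, 60590820065001968, as soon as it is parsed and before astype ever runs. Your smaller number, 590820065001969.0, is below 2**53, which is why it converts exactly.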
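As a rough check of that explanation, the sketch below uses only the standard library to show that the rounding happens in the float itself, with no Pandas involved. The names `x`, `exact_limit` and `gap` are just illustrative, and `math.ulp` assumes Python 3.9 or newer.

```python
import math

x = 60590820065001969.0   # the literal is rounded when it is parsed
exact_limit = 2 ** 53     # integers up to this magnitude are exact in a float64
gap = math.ulp(x)         # spacing between adjacent floats near x (Python 3.9+)

print(int(x))                   # 60590820065001968
print(int(x) > exact_limit)     # True: beyond the exactly representable range
print(gap)                      # 8.0
print(int(590820065001969.0))   # 590820065001969, below 2**53 so it is exact
```

Since representable values near `x` are multiples of 8, the closest one to ...969 is ...968, which matches the output in the question.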

Using the decimal library, and passing the value as a string so it never becomes a float, keeps it exact:

In [16]: import decimal

In [17]: a = decimal.Decimal("60590820065001969.0")

In [18]: int(a)
Out[18]: 60590820065001969
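To carry the same idea back into a DataFrame, one possible sketch is below. It assumes the values are available as text (the `raw` list is a hypothetical stand-in for wherever your data comes from), converts them through `Decimal` to Python ints, and only then builds an int64 column, so no float is ever created.

```python
from decimal import Decimal

import pandas as pd

# Hypothetical input: the value kept as text, so it is never parsed as a float.
raw = ["60590820065001969.0"]

# Decimal parses the string exactly; int() then drops the ".0" without rounding.
exact = [int(Decimal(s)) for s in raw]

# Build the column directly as int64, skipping float64 entirely.
df = pd.DataFrame({"test": pd.Series(exact, dtype="int64")})

print(df["test"].iloc[0])   # 60590820065001969
print(df["test"].dtype)     # int64
```

If the data starts out as integers rather than text, passing them straight to the DataFrame should work the same way. The key point is to avoid the float step.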

Upvotes: 2
