Niken Amelia

Reputation: 37

Python: how to convert an object column to int with very large values in pandas

I've been searching around for a while now, but I can't seem to find the answer to this small problem.

I have some data, as follows

name    address    age    total_money
joko    china      56     4,5430012000007E+63
feni    korea      35     3,6000016489732E+90
rani    kamboja    54     1,470001179997E+133

and the data type of each column is:

name           3     non-null    object
address        3     non-null    object
age            3     non-null    int64
total_money    3     non-null    object

How can I convert the total_money column to a numeric/int type? I want to use the column for arithmetic operations.

I tried converting the data type with:

df = df.astype({'total_money': 'int64'})

but I get this error:

OverflowError: int too big to convert

How can I handle this problem?

Upvotes: 0

Views: 257

Answers (1)

Mayank Porwal

Reputation: 34046

The problem is that your numbers are too big to fit in int64. You can still convert them to Python ints, which have arbitrary precision:

In [832]: df['total_money'].str.replace(',', '').astype(float).apply(lambda x: int(x))
Out[832]: 
0    4543001200000699948639040706606013103072417429...
1    3600001648973199763250405754747849609902640529...
2    1470001179997000081335562311854682630184495860...
Name: total_money, dtype: object

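If you need an int representation of the values, I suggest you do the above. The column dtype is still object, though, because the values are arbitrary-precision Python ints rather than int64. One caveat: the comma in your data looks like a decimal separator, and the snippets in this answer strip it with replace(',', ''), which shifts the exponent (4,5430012000007E+63 is read as roughly 4.54e+76). Use replace(',', '.') if you want to keep the original magnitude.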

If you try to convert this column to int64, you get an error:

OverflowError: Python int too large to convert to C long

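This is because your numbers exceed the largest value a 64-bit integer can hold, which on a 64-bit build of Python is the same as sys.maxsize (Python ints themselves have no upper bound):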

In [834]: import sys

In [835]: sys.maxsize
Out[835]: 9223372036854775807
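
To make the limit concrete, here is a minimal sketch that compares the first value from the question against the int64 maximum. The names INT64_MAX, big and fits_in_int64 are only illustrative, and the value is written out assuming the comma in the data is a decimal separator.

```python
import sys
import numpy as np

# Largest value an int64 column can hold.
INT64_MAX = int(np.iinfo(np.int64).max)

# A Python int has no fixed upper bound, so this value is fine on its own.
# It is 4,5430012000007E+63 from the question, read with a decimal comma.
big = 45430012000007 * 10**50

# It does not fit in 64 bits, which is why astype('int64') fails.
fits_in_int64 = big <= INT64_MAX

print(INT64_MAX)                  # 9223372036854775807
print(sys.maxsize == INT64_MAX)   # True on a 64-bit build of Python
print(fits_in_int64)              # False
```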

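So my suggestion is to convert the column to float instead. A float64 can hold values up to about 1.8e308, but it keeps only about 15 to 17 significant digits: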

In [837]: df['total_money'].str.replace(',', '').astype(float)
Out[837]: 
0     4.543001e+76
1    3.600002e+103
2    1.470001e+145
Name: total_money, dtype: float64
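
If the comma in the data is a decimal separator, which is an assumption based on how the values in the question look, a sketch of the float conversion that preserves the original exponents could look like this. The DataFrame is rebuilt here from the question's sample rows so the block runs on its own.

```python
import pandas as pd

# Sample rows copied from the question (only the columns needed here).
df = pd.DataFrame({
    "name": ["joko", "feni", "rani"],
    "total_money": [
        "4,5430012000007E+63",
        "3,6000016489732E+90",
        "1,470001179997E+133",
    ],
})

# Swap the decimal comma for a point, then parse the strings as float64.
df["total_money"] = (
    df["total_money"].str.replace(",", ".", regex=False).astype(float)
)

print(df["total_money"].dtype)    # float64

# The column now supports ordinary vectorized arithmetic.
halved = df["total_money"] / 2
total = df["total_money"].sum()
```

If the data comes from a CSV file, passing decimal=',' to pd.read_csv should parse such values as floats directly, without the string replacement step.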

EDIT after OP's comment on arithmetic operations:

In [840]: df['total_money'].str.replace(',', '').astype(float).apply(lambda x: int(x))/ 1000000000000000000000000000000000000000000000000000
Out[840]: 
0                         45430012000006996125286400.0
1    3600001648973199969342672855872140055708849674...
2    1470001179997000080464234076318923714804535550...
Name: total_money, dtype: object
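
If exact integer arithmetic matters, one alternative (not what the transcript above does) is to skip the float round-trip and parse the strings with decimal.Decimal, which avoids the rounding noise visible in the long digit strings. This is only a sketch: to_exact_int is a hypothetical helper, the comma is assumed to be a decimal separator, and the resulting columns stay object dtype, so operations run at Python speed rather than NumPy speed.

```python
from decimal import Decimal
import pandas as pd

raw = pd.Series(
    ["4,5430012000007E+63", "3,6000016489732E+90", "1,470001179997E+133"],
    name="total_money",
)

def to_exact_int(text):
    """Parse a decimal-comma, scientific-notation string into a Python int."""
    return int(Decimal(text.replace(",", ".")))

# Object-dtype column holding arbitrary-precision Python ints.
exact = pd.Series([to_exact_int(s) for s in raw], dtype=object, name=raw.name)

# Floor division keeps every result an exact Python int.
scaled = pd.Series([v // 10**51 for v in exact], dtype=object, name=raw.name)

print(exact.dtype)       # object
print(scaled.iloc[0])    # 4543001200000
```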

Upvotes: 2
