packybear

Reputation: 677

"OverflowError: Python int too large to convert to C long" on windows but not mac

I am running exactly the same code on both Windows and Mac, with 64-bit Python 3.5.

On Windows, it looks like this:

>>> import numpy as np
>>> preds = np.zeros((1, 3), dtype=int)
>>> p = [6802256107, 5017549029, 3745804973]
>>> preds[0] = p
Traceback (most recent call last):
  File "<pyshell#13>", line 1, in <module>
    preds[0] = p
OverflowError: Python int too large to convert to C long

However, this code works fine on my mac. Could anyone help explain why or give a solution for the code on windows? Thanks so much!

Upvotes: 65

Views: 270573

Answers (5)

Pratyush Tripathy

Reputation: 141

I got the same error while trying to convert an object-type column (actually holding strings) to an integer type.

This DID NOT work:

df['var1'] = df['var1'].astype(int)

This worked:

df['var1'] = df['var1'].apply(lambda x: int(x))
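Another option that may be worth trying is to request a fixed 64-bit integer type explicitly instead of the platform-dependent int. This is a minimal sketch, not part of the original answer: the column contents below are assumed for illustration (they reuse the numbers from the question as strings), and it only helps when every value fits in 64 bits.

```python
import pandas as pd

# Hypothetical data: large integers stored as strings in an object column.
df = pd.DataFrame({"var1": ["6802256107", "5017549029", "3745804973"]})

# Ask for a 64-bit integer explicitly rather than the platform's default int.
df["var1"] = df["var1"].astype("int64")

print(df["var1"].dtype)
print(df["var1"].tolist())
```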

Upvotes: 5

plugwash

Reputation: 10494

Could anyone help explain why

NumPy arrays normally* have fixed-size elements, including integers of various sizes, single- or double-precision floating point numbers, fixed-length byte and Unicode strings, and structures built up from the aforementioned types.

In Python 2, a Python "int" was equivalent to a C long. In Python 3, an "int" is an arbitrary-precision type, but NumPy still uses "int" to represent the C type "long" when creating arrays.

The size of a C long is platform dependent. On Windows it is always 32-bit. On Unix-like systems it is normally 32-bit on 32-bit systems and 64-bit on 64-bit systems.
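To see which widths apply on a particular machine, something like the following can be run. It is only a sketch with illustrative variable names. The printed values depend on the platform, and newer NumPy releases may map "int" differently from the version used in the question.

```python
import ctypes
import numpy as np

# Width of the C "long" type on this platform, in bits.
c_long_bits = ctypes.sizeof(ctypes.c_long) * 8

# Width of the integer type NumPy picks for dtype=int on this install.
default_int_bits = np.dtype(int).itemsize * 8

# np.int64 has the same range on every platform.
int64_max = np.iinfo(np.int64).max

print("C long:", c_long_bits, "bits")
print("dtype=int:", default_int_bits, "bits")
print("np.int64 max:", int64_max)
```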

or give a solution for the code on windows? Thanks so much!

Choose a data type whose size is not platform dependent. You can find the list at https://docs.scipy.org/doc/numpy/reference/arrays.scalars.html#arrays-scalars-built-in. The most sensible choice would probably be np.int64.

* NumPy does allow arrays of Python objects, but I don't think they are widely used.
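For completeness, here is a small sketch of the object-array route mentioned in the footnote. It is an illustration rather than a recommendation: each element is an ordinary Python int, so there is no size limit, but the array gives up the speed and compactness of fixed-size elements. The sample values are assumed for illustration.

```python
import numpy as np

# dtype=object stores references to Python objects, not fixed-size C integers.
preds = np.zeros((1, 3), dtype=object)

# These values are far too large for any fixed-width integer type.
preds[0] = [8258255190131389999999000003296, 50661, 2**64]

print(preds)
print(type(preds[0, 0]))
```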

Upvotes: 7

Jatin Malhotra

Reputation: 79

Convert to float:

import pandas as pd

df = pd.DataFrame()
l_var_l = [8258255190131389999999000003296, 50661]
df['temp'] = l_var_l
df['temp'] = df['temp'].astype(int)

The above fails with the error:

OverflowError: Python int too large to convert to C long.

Now try with float conversion:

df['temp'] = df['temp'].astype(float)
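One caveat worth checking before relying on this: a 64-bit float has a 53-bit significand, so integers above 2**53 are not all representable and the conversion can silently change the value. The sketch below uses plain Python floats (the same double-precision format) and the number from the example above. The variable names are illustrative.

```python
big = 8258255190131389999999000003296  # value from the example above

# Convert to float and back to see whether the value survives.
round_trip = int(float(big))
exact = (round_trip == big)

# Above 2**53, neighbouring integers start to collapse to the same float.
collide = (float(2**53) == float(2**53 + 1))

print("round trip exact:", exact)
print("2**53 and 2**53 + 1 collide:", collide)
```

If exact integer values matter (IDs, counts), a 64-bit integer type or an object column is probably the safer choice.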

Upvotes: 3

sammy ongaya

Reputation: 1401

You can use dtype=np.int64 instead of dtype=int
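Applied to the code from the question, that suggestion would look roughly like this (a minimal sketch; the values are the ones from the question and all fit comfortably in 64 bits):

```python
import numpy as np

# np.int64 is 64 bits wide on every platform, including Windows.
preds = np.zeros((1, 3), dtype=np.int64)
p = [6802256107, 5017549029, 3745804973]
preds[0] = p

print(preds)
print(preds.dtype)
```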

Upvotes: 45

Moses Koledoye

Reputation: 78546

You'll get that error once your numbers are greater than the largest value a C long can hold, which on this setup is the same as sys.maxsize:

>>> import sys
>>> p = [sys.maxsize]
>>> preds[0] = p
>>> p = [sys.maxsize+1]
>>> preds[0] = p
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C long

You can confirm this by checking:

>>> import sys
>>> sys.maxsize
2147483647

To store larger numbers, don't pass an int dtype, which uses a bounded C integer behind the scenes. Use the default float dtype instead:

>>> preds = np.zeros((1, 3))

Upvotes: 41
