Reputation: 51
astype raises a ValueError when using a dict of columns
I am trying to convert the type of a sparse column in a big DataFrame (from float to int). My problem is with the NaN values: they are not ignored when using a dict of columns, even if the errors parameter is set to 'ignore'.
Here is a toy example:
import numpy as np
import pandas as pd
t = pd.DataFrame([[1.01, 2], [3.01, 10], [np.nan, 20]])
t.astype({0: int}, errors='ignore')
ValueError: Cannot convert non-finite values (NA or inf) to integer
Upvotes: 4
Views: 9373
Reputation: 33773
You can use the new nullable integer dtype in pandas 0.24.0+. You'll first need to convert any floats that aren't exactly equal to integers into integer values (e.g. by rounding or truncating) before using astype:
In [1]: import numpy as np; import pandas as pd; pd.__version__
Out[1]: '0.24.2'
In [2]: t = pd.DataFrame([[1.01, 2],[3.01, 10], [np.NaN, 20]])
In [3]: t.round().astype('Int64')
Out[3]:
     0   1
0    1   2
1    3  10
2  NaN  20
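Since the question only casts one column, the same idea can be applied per column instead of to the whole frame. This is a minimal sketch, assuming pandas 0.24 or later and the toy data from the question; the name `converted` is only for illustration.

```python
import numpy as np
import pandas as pd

t = pd.DataFrame([[1.01, 2], [3.01, 10], [np.nan, 20]])

# Work on a copy so the original frame keeps its float column.
converted = t.copy()

# Round first so every non-missing value is integer-valued,
# then cast to the nullable "Int64" dtype (capital I), which can hold missing values.
converted[0] = converted[0].round().astype("Int64")

print(converted.dtypes)
print(converted)
```

Column 1 is left untouched, and the missing entry in column 0 stays missing rather than raising.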
Upvotes: 5
Reputation: 46291
Try this:
t.astype('int64', copy=False, errors='ignore')
Will output:
      0   1
0  1.01   2
1  3.01  10
2   NaN  20
As per the docs, the first argument may be a single dtype.
UPDATE:
t=pd.DataFrame([[1.01,2],[3.01, 10], [np.NaN,20]],
columns=['0', '1'])
t.astype({'0': 'int64', '1': 'int64'}, errors='ignore')
I also tried adding column names to your dataset, but that failed too. It may be a notation quirk, a bug, or a problem with the in-place copy.
Upvotes: 1
Reputation:
Try this:
out = t.fillna(99999).astype(int)
final = out.replace(99999, 'Nan')
Output:
     0   1
0    1   2
1    3  10
2  Nan  20
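A slightly more defensive variant of the same placeholder idea, as a sketch only: it records where the NaNs were before filling, so a real value that happens to equal the placeholder is not overwritten afterwards. `SENTINEL` and `nan_mask` are hypothetical names, and the placeholder value is assumed not to matter for the final result.

```python
import numpy as np
import pandas as pd

t = pd.DataFrame([[1.01, 2], [3.01, 10], [np.nan, 20]])

SENTINEL = 99999  # hypothetical placeholder, only used to get past the int cast

# Remember where the missing values were before filling them.
nan_mask = t.isna()

# astype(int) truncates toward zero, so 1.01 becomes 1 and 3.01 becomes 3.
out = t.fillna(SENTINEL).astype(int)

# Put a marker back only at the originally missing positions.
# Mixing ints and strings makes the columns object dtype.
final = out.astype(object).mask(nan_mask, "Nan")

print(final)
```

Note that the result is no longer numeric: columns holding the string marker are object dtype, so arithmetic on them will not behave like it does on an integer column.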
Upvotes: 0