GCa

Reputation: 51

DataFrame.astype() errors parameter

astype raises a ValueError when using a dict of columns.

I am trying to convert the type of a sparse column in a big DataFrame (from float to int). My problem is with the NaN values: they are not ignored when I pass a dict of columns, even if the errors parameter is set to 'ignore'.

Here is a toy example:

import numpy as np
import pandas as pd
t = pd.DataFrame([[1.01, 2], [3.01, 10], [np.nan, 20]])
t.astype({0: int}, errors='ignore')

ValueError: Cannot convert non-finite values (NA or inf) to integer

Upvotes: 4

Views: 9373

Answers (4)

root

Reputation: 33773

You can use the new nullable integer dtype in pandas 0.24.0+. Before calling astype, you'll first need to turn any floats that aren't whole numbers into whole numbers (e.g. by rounding or truncating):

In [1]: import numpy as np; import pandas as pd; pd.__version__
Out[1]: '0.24.2'

In [2]: t = pd.DataFrame([[1.01, 2],[3.01, 10], [np.NaN, 20]])

In [3]: t.round().astype('Int64')
Out[3]:
     0   1
0    1   2
1    3  10
2  NaN  20
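
Since the question asks about a dict of columns, here is a minimal sketch of the same idea applied to a single column. It assumes pandas 0.24.0 or later and that rounding is an acceptable way to make the floats whole; the name `converted` is only illustrative.

```python
import numpy as np
import pandas as pd

t = pd.DataFrame([[1.01, 2], [3.01, 10], [np.nan, 20]])

# Work on a copy so the original frame is left untouched.
converted = t.copy()

# Round only the column being converted, so every non-missing value is whole.
converted[0] = converted[0].round()

# Cast that one column to the nullable 'Int64' dtype using a dict;
# the missing value stays missing instead of raising a ValueError.
converted = converted.astype({0: 'Int64'})

print(converted.dtypes)
print(converted)
```

Column 1 is not listed in the dict, so its dtype is left as it was.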

Upvotes: 5

Nev1111

Reputation: 1049

Try

t_new=t.mask(t.notnull(),t.values.astype(int))

Upvotes: 0

prosti

Reputation: 46291

Try this:

t.astype('int64', copy=False, errors='ignore')

Will output:

      0   1
0  1.01   2
1  3.01  10
2   NaN  20

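As per the docs, the first argument may be a single dtype, which is then applied to the whole DataFrame. Note that column 0 comes back unchanged here rather than converted.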


UPDATE:

t = pd.DataFrame([[1.01, 2], [3.01, 10], [np.nan, 20]],
                 columns=['0', '1'])
t.astype({'0': 'int64', '1': 'int64'}, errors='ignore')

I also tried adding column names to your dataset, but that failed as well. It may be a notation quirk, a bug, or a problem with in-place copying.

Upvotes: 1

user7313188

Reputation:

Try this:

out = t.fillna(99999).astype(int)
final = out.replace(99999, 'Nan')

Output:

     0   1
0    1   2
1    3  10
2  Nan  20

Upvotes: 0
