Hrvoje

Reputation: 15252

Error converting object (string) to Int32: TypeError: object cannot be converted to an IntegerDtype

I get the following error when trying to convert an object (string) column in pandas to Int32, the integer type that allows NA values.

df.column = df.column.astype('Int32')

TypeError: object cannot be converted to an IntegerDtype

I'm using pandas version: 0.25.3

Upvotes: 24

Views: 37498

Answers (4)

Vikas Garud

Reputation: 163

The best way to do this is as follows:

df['col'] = pd.to_numeric(df['col'], errors='coerce').astype(pd.Int32Dtype())

This first converts any value that cannot be parsed as a number to NaN, and the cast to Int32 then turns each NaN into NA.
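A minimal sketch of the two steps on a small made-up Series (the sample values are an assumption for illustration, not data from the question). It assumes every parseable value is a whole number; a value such as "1.5" would survive to_numeric as a float and could not be cast safely to Int32.

```python
import pandas as pd

# Hypothetical sample data: two valid integers, one invalid string, one missing value.
s = pd.Series(["1", "2", "abc", None], dtype="object")

# Step 1: parse to numbers; anything unparseable becomes NaN (result is float64).
nums = pd.to_numeric(s, errors="coerce")

# Step 2: cast to the nullable integer dtype; NaN becomes <NA>.
result = nums.astype(pd.Int32Dtype())

print(result)
```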

Upvotes: 0

Hrvoje

Reputation: 15252

It's a known bug, as explained here.

The workaround is to convert the column to float first and then to Int32.

Make sure you strip whitespace from the column before converting:

df.column = df.column.str.strip()

Then do the conversion:

df.column = df.column.astype('float')  # first convert to float before int
df.column = df.column.astype('Int32')

Or, more simply:

df.column = df.column.astype('float').astype('Int32')  # or Int64
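Putting the workaround together, here is a small self-contained sketch. The DataFrame contents are invented for illustration, and it assumes the strings hold whole numbers once the whitespace is removed.

```python
import numpy as np
import pandas as pd

# Hypothetical object (string) column with stray whitespace and a missing value.
df = pd.DataFrame({"column": pd.Series([" 1", "2 ", " 3 ", np.nan], dtype="object")})

# Strip surrounding whitespace; the missing value is left untouched.
df["column"] = df["column"].str.strip()

# Go through float first, then to the nullable Int32 dtype.
df["column"] = df["column"].astype("float").astype("Int32")

print(df["column"])
```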

Upvotes: 41

sng

Reputation: 107

As of v0.24, you can use: df['col'] = df['col'].astype(pd.Int32Dtype())

Edit: I should have mentioned that this falls under the Nullable integer documentation. The docs specify other nullable integer types as well (e.g. Int64Dtype, Int8Dtype, UInt64Dtype).
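As a rough illustration of the nullable integer dtypes, the sketch below casts a numeric column that already contains a missing value. The sample Series is an assumption; it starts from floats rather than strings, since the string case is what raises the error in the question.

```python
import pandas as pd

# Hypothetical numeric data with a missing value (stored as float64 with NaN).
s = pd.Series([1.0, 2.0, None])

# The dtype can be given as an object or by its string alias.
as_int32 = s.astype(pd.Int32Dtype())
as_uint8 = s.astype("UInt8")

print(as_int32.dtype, as_uint8.dtype)
```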

Upvotes: 3

Haolin Jia

Reputation: 1

Personally, I use df = df.astype({i: type_dict[i] for i in header}, errors='ignore') to deal with this problem. Note that errors='ignore' suppresses conversion exceptions, so a conversion that fails returns the original data unchanged instead of raising. Though it is inelegant and may hide other critical bugs, it does work for converting np.nan, an integer string like '100', or an int like 100 to a pandas nullable integer type. Hope this helps.

Upvotes: 0
