Reputation: 500
I'm working with Python 3.6 and Pandas 1.0.3.
I would like to convert the floats in column "A" to int. This column contains some NaN values.
So I followed this post with the solution of @jezrael.
But I get the following error: "TypeError: cannot safely cast non-equivalent float64 to int64"
This is my code:
import pandas as pd
import numpy as np
data = {'timestamp': [1588757760.0000, 1588757760.0161, 1588757764.7339, 1588757764.9234], 'A':[9087.6000, 9135.8000, np.nan, 9102.1000], 'B':[0.1648, 0.1649, '', 5.3379], 'C':['b', 'a', '', 'a']}
df = pd.DataFrame(data)
df['A'] = pd.to_numeric(df['A'], errors='coerce').astype('Int64')
print(df)
Did I miss something?
Upvotes: 3
Views: 1641
Reputation: 14131
Your problem is that the column holds true floating-point numbers (with fractional parts), not integers stored in float form. For safety reasons pandas will not convert them, because the conversion would silently change the values.
So you first need to round them explicitly to whole numbers, and only then use the .astype() method:
df['A'] = pd.to_numeric(df['A'].round(), errors='coerce').astype('Int64')
Test:
print(df)
      timestamp     A       B  C
0  1.588758e+09  9088  0.1648  b
1  1.588758e+09  9136  0.1649  a
2  1.588758e+09   NaN
3  1.588758e+09  9102  5.3379  a
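To make the cause visible, here is a minimal sketch that isolates column "A" in a standalone Series (the names `s`, `cast_failed` and `converted` are only illustrative). It assumes a pandas version with the nullable 'Int64' dtype and shows that the direct cast is rejected while the rounded values cast cleanly:

```python
import numpy as np
import pandas as pd

# Same values as column "A" in the question.
s = pd.Series([9087.6, 9135.8, np.nan, 9102.1], name="A")

# Casting directly is rejected because the values have fractional parts.
try:
    s.astype("Int64")
    cast_failed = False
except (TypeError, ValueError) as exc:
    cast_failed = True
    print("direct cast failed:", exc)

# After rounding, every non-missing value is a whole number,
# so the cast is safe and NaN becomes a missing value in the Int64 column.
converted = s.round().astype("Int64")
print(converted)
```

The missing entry stays missing after the conversion, so no placeholder value is needed.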
Upvotes: 4
Reputation: 7594
One way to do it is to temporarily replace NaN with a placeholder integer, cast, and then turn the placeholder back into NaN:
df['A'] = df['A'].fillna(99999999).astype(np.int64, errors='ignore')
df['A'] = df['A'].replace(99999999, np.nan)
df
timestamp A B C
0 1.588758e+09 9087 0.1648 b
1 1.588758e+09 9135 0.1649 a
2 1.588758e+09 NaN
3 1.588758e+09 9102 5.3379 a
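Note that this output contains 9087 and 9135, while the rounding approach gives 9088 and 9136: casting with np.int64 truncates instead of rounding. If truncation is what you want, a possible sketch without a placeholder value is to truncate with np.trunc and then cast to the nullable 'Int64' dtype (the Series `s` and the variable names are only illustrative):

```python
import numpy as np
import pandas as pd

# Same values as column "A" in the question.
s = pd.Series([9087.6, 9135.8, np.nan, 9102.1], name="A")

# Round to the nearest whole number, then cast.
rounded = s.round().astype("Int64")

# Drop the fractional part instead, then cast; NaN is left untouched by np.trunc.
truncated = np.trunc(s).astype("Int64")

print(pd.DataFrame({"rounded": rounded, "truncated": truncated}))
```

Either way the missing value is preserved, so it cannot collide with a real value the way a placeholder such as 99999999 could.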
Upvotes: 0