Reputation: 63
My DataFrame is named reviews.
I get the following error when I try to convert the Rating column to integer:
'Cannot convert non-finite values (NA or inf) to integer'
How can I fix it?
reviews.replace([np.inf, -np.inf], np.nan)
reviews.dropna()
reviews['Rating'].astype('int')
Upvotes: 0
Views: 167
Reputation: 8816
The simplest way is to first replace the infs with NaN and then use dropna:
Example DataFrame:
>>> df = pd.DataFrame({'col1':[1, 2, 3, 4, 5, np.inf, -np.inf], 'col2':[6, 7, 8, 9, 10, np.inf, -np.inf]})
>>> df
col1 col2
0 1.000000 6.000000
1 2.000000 7.000000
2 3.000000 8.000000
3 4.000000 9.000000
4 5.000000 10.000000
5 inf inf
6 -inf -inf
Solution 1:
Create df_new so that the original DataFrame is left untouched and the cleaned result lives separately in df_new.
>>> df_new = df.replace([np.inf, -np.inf], np.nan).dropna(subset=["col1", "col2"], how="all").astype(int)
>>> df_new
col1 col2
0 1 6
1 2 7
2 3 8
3 4 9
4 5 10
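One detail worth checking in Solution 1 is the how="all" argument: it drops a row only when every listed column is NaN. In the example above the infs always appear in both columns, so it makes no difference, but on data where only one column is non-finite the int cast would still fail. A minimal sketch of the difference, using a hypothetical frame df_mixed that is not part of the original example:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 1 has inf in col1 only, row 2 is non-finite in both columns.
df_mixed = pd.DataFrame({"col1": [1.0, np.inf, -np.inf], "col2": [6.0, 7.0, np.inf]})
cleaned = df_mixed.replace([np.inf, -np.inf], np.nan)

# how="all" drops a row only when every listed column is NaN, so row 1 survives.
kept_all = cleaned.dropna(subset=["col1", "col2"], how="all")

# how="any" (the default) drops a row as soon as one listed column is NaN.
kept_any = cleaned.dropna(subset=["col1", "col2"], how="any")

# Only the how="any" result is free of NaN and therefore safe to cast to int.
df_int = kept_any.astype(int)
```

If your columns can be non-finite independently of each other, how="any" is probably the safer choice before calling astype(int).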
Solution 2:
Using isin to flag the rows whose values are all non-finite, and ~ to invert that mask:
>>> ff = df.isin([np.inf, -np.inf, np.nan]).all(axis='columns')
>>> df[~ff].astype(int)
col1 col2
0 1 6
1 2 7
2 3 8
3 4 9
4 5 10
Or, directly on the original DataFrame: use pd.DataFrame.isin and check for rows that contain any non-finite value with pd.DataFrame.any. Finally, use the boolean array to slice the DataFrame.
>>> df = df[~df.isin([np.nan, np.inf, -np.inf]).any(axis=1)].astype(int)
>>> df
col1 col2
0 1 6
1 2 7
2 3 8
3 4 9
4 5 10
The above is taken from here, courtesy of @piRSquared.
Solution 3:
You are also free to use DataFrame.mask with numpy.isinf, followed by dropna():
>>> df = df.mask(np.isinf(df)).dropna().astype(int)
>>> df
col1 col2
0 1 6
1 2 7
2 3 8
3 4 9
4 5 10
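To make the chained call in Solution 3 easier to follow, here is the same pipeline split into its intermediate steps. This is only an illustrative sketch; df_demo and the variable names are made up for the example.

```python
import numpy as np
import pandas as pd

# Small hypothetical frame with one clean row and one row of infinities.
df_demo = pd.DataFrame({"col1": [1.0, np.inf], "col2": [6.0, -np.inf]})

# np.isinf returns a boolean frame marking the +inf and -inf positions.
is_inf = np.isinf(df_demo)

# mask() replaces the values where the condition is True with NaN by default.
masked = df_demo.mask(is_inf)

# dropna() removes the rows that now contain NaN, so the int cast is safe.
result = masked.dropna().astype(int)
```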
Upvotes: 1
Reputation: 5741
Neither .replace() nor .dropna() performs its action in place, i.e. modifies the existing DataFrame, unless you tell it to. If you do specify that they should operate in place, your code works:
reviews.replace([np.inf, -np.inf], np.nan, inplace=True)
reviews.dropna(inplace=True)
reviews['Rating'] = reviews['Rating'].astype('int')
Or:
reviews = reviews.replace([np.inf, -np.inf], np.nan)
reviews = reviews.dropna()
reviews['Rating'] = reviews['Rating'].astype('int')
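If only the Rating column needs converting, it may be preferable to clean just that column, so that rows with missing values in unrelated columns are not dropped. A minimal sketch, assuming a stand-in reviews frame (only the Rating column name comes from the question; the Comment column and the values are invented for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data.
reviews = pd.DataFrame({
    "Rating": [4.0, np.nan, np.inf, 5.0],
    "Comment": ["a", "b", "c", None],
})

# replace() returns a new Series, so assign the result back.
reviews["Rating"] = reviews["Rating"].replace([np.inf, -np.inf], np.nan)

# Drop only the rows that lack a Rating; copy() gives an independent frame to modify.
reviews = reviews.dropna(subset=["Rating"]).copy()

# astype() also returns a new Series, so assign it back as well.
reviews["Rating"] = reviews["Rating"].astype(int)
```

Here the last row is kept even though its Comment is missing, because dropna only looks at the columns named in subset.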
Upvotes: 0