cekik

Reputation: 63

Converting a column to int in Python

My DataFrame is named reviews.

I get the following error when I try to convert the Rating column to integer:

"Cannot convert non-finite values (NA or inf) to integer"

How can I fix it?

reviews.replace([np.inf, -np.inf], np.nan)
reviews.dropna() 

reviews['Rating'].astype('int')

Upvotes: 0

Views: 167

Answers (2)

Karn Kumar

Reputation: 8816

The simplest way is to first replace the infs with NaN and then use dropna:

Example DataFrame:

>>> df = pd.DataFrame({'col1':[1, 2, 3, 4, 5, np.inf, -np.inf], 'col2':[6, 7, 8, 9, 10, np.inf, -np.inf]})

>>> df
       col1       col2
0  1.000000   6.000000
1  2.000000   7.000000
2  3.000000   8.000000
3  4.000000   9.000000
4  5.000000  10.000000
5       inf        inf
6      -inf       -inf

Solution 1:

Create a df_new; that way you do not lose the original DataFrame, and the desired result lives separately in df_new.

>>> df_new = df.replace([np.inf, -np.inf], np.nan).dropna(subset=["col1", "col2"], how="all").astype(int)
>>> df_new
   col1  col2
0     1     6
1     2     7
2     3     8
3     4     9
4     5    10
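
A caveat on Solution 1, sketched under assumptions: how="all" drops a row only when every column listed in subset is NaN. In the example above inf always appears in both columns at once, so it works, but if only one column holds a non-finite value the row survives and astype(int) would still fail. The small frame below is hypothetical and only illustrates the difference from how="any" (the default).

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 1 has inf in one column only, row 2 in both.
df = pd.DataFrame({"col1": [1.0, np.inf, np.inf],
                   "col2": [6.0, 7.0, -np.inf]})
cleaned = df.replace([np.inf, -np.inf], np.nan)

# how="all": a row is dropped only if every listed column is NaN,
# so row 1 (NaN in col1 only) is kept.
partial = cleaned.dropna(subset=["col1", "col2"], how="all")

# how="any" (the default): a row is dropped if at least one listed
# column is NaN, which leaves only fully finite rows.
strict = cleaned.dropna(subset=["col1", "col2"], how="any")
converted = strict.astype(int)
```

If the goal is a clean integer conversion, how="any" is probably the safer choice, since it guarantees no NaN remains in the listed columns.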

Solution 2:

Using isin and ~:

>>> ff = df.isin([np.inf, -np.inf, np.nan]).all(axis='columns')
>>> df[~ff].astype(int)
   col1  col2
0     1     6
1     2     7
2     3     8
3     4     9
4     5    10

Or, working directly on the original DataFrame: use pd.DataFrame.isin to flag the unwanted values, and pd.DataFrame.any to find the rows that contain at least one of them. Finally, use the negated boolean array to slice the DataFrame.

>>> df = df[~df.isin([np.nan, np.inf, -np.inf]).any(axis=1)].astype(int)
>>> df
   col1  col2
0     1     6
1     2     7
2     3     8
3     4     9
4     5    10

The approach above is courtesy of @piRSquared.

Solution 3:

You can also use DataFrame.mask with numpy.isinf, followed by dropna():

>>> df = df.mask(np.isinf(df)).dropna().astype(int)
>>> df
   col1  col2
0     1     6
1     2     7
2     3     8
3     4     9
4     5    10

Upvotes: 1

gosuto

Reputation: 5741

Neither .replace() nor .dropna() works in place (i.e. modifies the existing DataFrame) unless you tell it to. If you do specify in-place operation, your code works:

reviews.replace([np.inf, -np.inf], np.nan, inplace=True)
reviews.dropna(inplace=True) 

reviews['Rating'] = reviews['Rating'].astype('int')  # astype also returns a new Series, so assign it back

Or:

reviews = reviews.replace([np.inf, -np.inf], np.nan)
reviews = reviews.dropna() 

reviews['Rating'] = reviews['Rating'].astype('int')  # astype also returns a new Series, so assign it back
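
To see the difference end to end, here is a minimal self-contained sketch. The reviews frame below is a hypothetical stand-in for the asker's data (only the Rating column name comes from the question); it shows that the un-assigned calls leave the frame unchanged, and that assigning the results back makes the conversion succeed.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data.
reviews = pd.DataFrame({"Rating": [4.0, np.inf, np.nan, 5.0]})

# Without assignment the cleaned copy is thrown away
# and the original frame keeps all four rows.
reviews.replace([np.inf, -np.inf], np.nan).dropna()
untouched_rows = len(reviews)

# Assign each result back, including the one from astype().
reviews = reviews.replace([np.inf, -np.inf], np.nan).dropna()
reviews["Rating"] = reviews["Rating"].astype(int)
```

Note that dropna() with no arguments drops a row if any column is NaN, so on a wider frame you may prefer dropna(subset=["Rating"]) to avoid losing rows over unrelated columns.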

Upvotes: 0
