Reputation: 187
I am applying backward elimination using statsmodels.api, and the code raises this error: `TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''`
I have no clue how to solve it.
Here is the code:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
import statsmodels.api as smf
data = pd.read_csv('F:/Py Projects/ML_Dataset/50_Startups.csv')
dataSlice = data.head(10)
#get data column
readX = data.iloc[:,:4].values
readY = data.iloc[:,4].values
#encoding c3
transformer = ColumnTransformer(
transformers=[("OneHot",OneHotEncoder(),[3])],
remainder='passthrough' )
readX = transformer.fit_transform(readX.tolist())
readX = readX[:,1:]
trainX, testX, trainY, testY = train_test_split(readX,readY,test_size=0.2,random_state=0)
lreg = LinearRegression()
lreg.fit(trainX, trainY)
predY = lreg.predict(testX)
readX = np.append(arr=np.ones((50,1),dtype=np.int),values=readX,axis=1)
optimisedX = readX[:,[0,1,2,3,4,5]]
ols = smf.OLS(endog=readX, exog=optimisedX).fit()
print(ols.summary())
Here is the error message:
Traceback (most recent call last):
File "F:/Py Projects/ml/BackwardElimination.py", line 33, in <module>
ols = smf.OLS(endog=readX, exog=optimisedX).fit()
File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\regression\linear_model.py", line 838, in __init__
hasconst=hasconst, **kwargs)
File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\regression\linear_model.py", line 684, in __init__
weights=weights, hasconst=hasconst, **kwargs)
File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\regression\linear_model.py", line 196, in __init__
super(RegressionModel, self).__init__(endog, exog, **kwargs)
File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\base\model.py", line 216, in __init__
super(LikelihoodModel, self).__init__(endog, exog, **kwargs)
File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\base\model.py", line 68, in __init__
**kwargs)
File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\base\model.py", line 91, in _handle_data
data = handle_data(endog, exog, missing, hasconst, **kwargs)
File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\base\data.py", line 635, in handle_data
**kwargs)
File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\base\data.py", line 80, in __init__
self._handle_constant(hasconst)
File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\base\data.py", line 125, in _handle_constant
if not np.isfinite(ptp_).all():
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
Upvotes: 4
Views: 16794
Reputation: 11
Today I received the same error.
The root cause is that the array has NumPy dtype object. The fix is to convert it to float64, assign the result to a new variable, and pass that variable to the function.
X[1:3]
#array([[1, 0.0, 0.0, 162597.7, 151377.59, 443898.53],
# [1, 1.0, 0.0, 153441.51, 101145.55, 407934.54]], dtype=object)
X.dtype
#dtype('O')
X1= X.astype(np.float64)
X1[1:2]
#array([[1.0000000e+00, 0.0000000e+00, 0.0000000e+00, 1.625977e+05, 1.5137759e+05, 4.4389853e+05]])
X1.dtype
#dtype('float64')
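As a rough check of the explanation above, the sketch below reproduces the failure outside statsmodels by calling np.isfinite directly, the same call that appears at the bottom of the traceback. The two rows are copied from the output above; the names raised and all_finite are only illustrative.

```python
import numpy as np

# Two rows from the output above, stored with dtype=object on purpose
# to mimic the array that triggered the error.
X = np.array([[1, 0.0, 0.0, 162597.7, 151377.59, 443898.53],
              [1, 1.0, 0.0, 153441.51, 101145.55, 407934.54]], dtype=object)

# np.isfinite has no implementation for object arrays, so it raises TypeError.
try:
    np.isfinite(X)
    raised = False
except TypeError:
    raised = True

# After converting to float64 the same check works.
X1 = X.astype(np.float64)
all_finite = bool(np.isfinite(X1).all())

print(raised, X1.dtype, all_finite)
```

If the array contained a value that cannot be parsed as a number, astype would raise a ValueError instead, which would point to a column that still needs encoding.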
Upvotes: 1
Reputation: 2465
Just add the marked line:
X_opt = X[:, [0, 1, 2, 3, 4, 5]]
X_opt = np.array(X_opt, dtype=float) # <-- this line
It converts the selection to a NumPy array with a float dtype.
Upvotes: 5
Reputation: 51
You need to change the dtype of readX to int or float64 using NumPy's astype() function before optimisedX is initialized. Also change endog to readY:
readX = readX.astype('float64')
optimisedX = readX[:,[0,1,2,3,4,5]]
ols = smf.OLS(endog=readY, exog=optimisedX).fit()
print(ols.summary())
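One detail worth checking when applying this: astype returns a new array and does not convert the original in place, so the result has to be assigned back. The sketch below uses a small made-up object array standing in for readX; the variable names dtype_without_assignment and dtype_with_assignment are only illustrative.

```python
import numpy as np

# Made-up stand-in for readX, stored as dtype=object like the encoded data.
readX = np.array([[1, 0.0, 165349.2],
                  [1, 1.0, 162597.7]], dtype=object)

# Calling astype without assignment discards the converted copy.
readX.astype('float64')
dtype_without_assignment = readX.dtype

# Rebinding the name keeps the converted array.
readX = readX.astype('float64')
dtype_with_assignment = readX.dtype

print(dtype_without_assignment, dtype_with_assignment)
```

Any slice taken after the reassignment, such as optimisedX, then inherits the float64 dtype.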
Upvotes: 5