Reputation: 113
In my dataset, I first replaced all missing values using the imputer class with the mean strategy, but it replaced them with large values, which is resulting in this error. What could be the solution, or how can I round the values to 2 decimal places? Since the dataset contains float values, rounding them to 2 or 3 decimal places would work for me.
Code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df=pd.read_csv("C:/Users/asus/Desktop/Life Expectancy Data.csv")
X=df.iloc[:, 4:].values
Y=df.iloc[:,3:4].values
from sklearn.impute import SimpleImputer
imputer=SimpleImputer(missing_values=np.nan,strategy='mean')
imputer.fit(X)
X=imputer.transform(X)
from sklearn.model_selection import train_test_split
X_train,X_test,Y_train,Y_test=train_test_split(X,Y,test_size=0.2,random_state=0)
from sklearn.linear_model import LinearRegression
reg=LinearRegression()
reg.fit(X_train,Y_train)
Upvotes: 1
Views: 631
Reputation: 38
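First replace any infinite values with NaN. This call needs X_train to be a pandas DataFrame; in the question it is a NumPy array (it comes from .values), so wrap it with X_train = pd.DataFrame(X_train) first:
X_train.replace([np.inf, -np.inf], np.nan, inplace=True)
Then fill the remaining null values, either with a constant placeholder:
X_train.fillna(999, inplace=True)
or with the column means:
X_train.fillna(X_train.mean(), inplace=True)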
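If you would rather keep working with NumPy arrays, as in the question's code, the same idea can be sketched as below. This is only a minimal illustration under some assumptions: clean_array, X_demo and Y_demo are made-up names and toy data, not part of the original dataset, and the input is assumed to be a 2-D float array. Note that the question imputes only X, so it may be worth running the same check on Y as well. It also rounds to 2 decimal places, as asked.

```python
import numpy as np

def clean_array(a, decimals=2):
    """Hypothetical helper: turn +/-inf into NaN, fill NaN with the
    column mean, then round. Assumes a 2-D numeric array."""
    a = np.asarray(a, dtype=float).copy()
    a[~np.isfinite(a)] = np.nan            # inf, -inf (and NaN) -> NaN
    col_means = np.nanmean(a, axis=0)      # per-column mean, ignoring NaN
    rows, cols = np.where(np.isnan(a))     # positions that still need a value
    a[rows, cols] = col_means[cols]
    return np.round(a, decimals)

# Toy data standing in for the real X and Y (assumption, not the real CSV)
X_demo = np.array([[1.0, np.inf], [3.0, 4.0], [np.nan, 8.0]])
Y_demo = np.array([[65.123], [np.nan], [70.457]])

X_clean = clean_array(X_demo)
Y_clean = clean_array(Y_demo)

print(np.isfinite(X_clean).all(), np.isfinite(Y_clean).all())
print(X_clean)
print(Y_clean)
```

A column that is entirely NaN would still come out as NaN here, so such columns need to be dropped separately. For the target Y, dropping the rows with a missing value is a common alternative to filling them with the mean.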
Upvotes: 2