Reputation: 21
When I apply this decision tree algorithm to my data, I get the error below. Can anyone help me resolve it?
from sklearn.tree import DecisionTreeRegressor
regressor = DecisionTreeRegressor()
regressor.fit(X_train, y_train)
Error: ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
Upvotes: 2
Views: 1821
Reputation:
sklearn is telling you that X_train or y_train contains values it cannot work with: missing values (NaN), infinities, or numbers too large for float32. Missing values are by far the most common cause in real-world datasets. Because the estimator needs a finite number in every cell, you have to fill in or drop those values before calling fit. Common approaches are to replace them with the mean, median or mode of the column.
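Before changing anything, it can help to confirm which columns actually hold the offending values. The snippet below is only a sketch: the small X_train array is made-up stand-in data, not the asker's dataset, and it assumes all features are numeric.

```python
import numpy as np

# Hypothetical stand-in for the real X_train (3 rows, 3 numeric features).
X_train = np.array([
    [1.0, 4.0, 7.0],
    [np.nan, 5.0, 8.0],
    [3.0, np.inf, 9.0],
])

# np.asarray also accepts a pandas DataFrame with numeric columns.
values = np.asarray(X_train, dtype=float)

# Count NaN and infinite entries per column.
nan_counts = np.isnan(values).sum(axis=0)
inf_counts = np.isinf(values).sum(axis=0)

# Indices of the columns that would trigger the ValueError.
bad_columns = np.flatnonzero((nan_counts + inf_counts) > 0)

print("NaN per column:", nan_counts)
print("inf per column:", inf_counts)
print("columns to fix:", bad_columns)
```

Running the same checks on y_train is worthwhile too, since the error can come from either input.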
Here's a comprehensive guide to imputing missing values in sklearn: https://scikit-learn.org/stable/modules/impute.html
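As a rough illustration of the median approach, here is a minimal sketch using SimpleImputer in a pipeline with the regressor from the question. The arrays are invented toy data, and the choice of the median strategy is an assumption; mean or most_frequent may suit your columns better. Note that SimpleImputer fills NaN only, so infinities are converted to NaN first.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeRegressor

# Toy stand-ins for the real X_train / y_train (assumed numeric).
X_train = np.array([
    [1.0, 4.0],
    [np.nan, 5.0],
    [3.0, np.inf],
    [4.0, 7.0],
])
y_train = np.array([10.0, 20.0, 30.0, 40.0])

# SimpleImputer treats NaN as missing but rejects infinities,
# so turn +/-inf into NaN before imputing.
X_clean = np.where(np.isinf(X_train), np.nan, X_train)

# Impute each column with its median, then fit the tree.
model = make_pipeline(
    SimpleImputer(strategy="median"),
    DecisionTreeRegressor(random_state=0),
)
model.fit(X_clean, y_train)

preds = model.predict(X_clean)
print(preds)
```

Keeping the imputer inside the pipeline means the medians learned from the training data are reused when you call predict on test data. If y_train itself contains NaN, those rows generally need to be dropped rather than imputed.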
Upvotes: 1