ValueError: Input contains NaN, infinity or a value too large for dtype('float32'). Why?

Question

I have gone through the similar questions, but none of them answers my problem. I am using a random forest classifier as follows:

from sklearn.ensemble import RandomForestClassifier
clf = RandomForestClassifier(n_estimators=100, max_depth=2, random_state=0)
clf.fit(X_train, y_train)
clf.predict(X_test)

It's giving me this error:

ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

However, when I run X_train.describe(), I don't see any missing values. In fact, I had already handled the missing values before splitting my data.

When I do the following:

np.where(X_train.values >= np.finfo(np.float32).max)

I get:

(array([], dtype=int64), array([], dtype=int64))

And for these commands:

np.any(np.isnan(X_train))     # True
np.all(np.isfinite(X_train))  # False
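These two checks only say that a problem exists somewhere in the frame, not where. A minimal sketch for locating the offending columns follows; the small `X_train` here is a hypothetical stand-in for the real data, and the names `report` and `bad_columns` are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real X_train (assumption, not the asker's data).
X_train = pd.DataFrame({
    "a": [1.0, 2.0, np.nan],
    "b": [1.0, np.inf, 3.0],
    "c": [1.0, 2.0, 3.0],
})

# Count NaN and +/-inf entries per column.
nan_counts = X_train.isna().sum()
inf_counts = np.isinf(X_train).sum()

report = pd.DataFrame({"nan": nan_counts, "inf": inf_counts})

# Keep only the columns that contain at least one NaN or infinite value.
bad_columns = report[(report["nan"] > 0) | (report["inf"] > 0)]
print(bad_columns)
```

Note that describe() computes its statistics while skipping NaN, so missing values only show up there as a lower number in the count row.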

After getting these results, I also tried this:

X_train.fillna(X_train.mean())

but I still get the same error.

Please tell me where I'm going wrong. Thank you!
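One likely cause, offered as a sketch rather than a confirmed diagnosis: fillna returns a new DataFrame by default and leaves the original untouched, so calling it without assigning the result changes nothing. In addition, fillna does not touch infinite values, so any -inf or +inf would have to be converted to NaN first. The frame below is a hypothetical stand-in for the real X_train, and `X_clean` is an illustrative name.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real X_train (assumption, not the asker's data).
X_train = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [2.0, -np.inf, 6.0],
})

# Turn infinities into NaN so that they can be imputed as well.
X_clean = X_train.replace([np.inf, -np.inf], np.nan)

# fillna returns a new object, so the result has to be assigned.
X_clean = X_clean.fillna(X_clean.mean())

# Every value should now be finite.
print(np.isfinite(X_clean.to_numpy()).all())
```

The same cleaning would need to be applied to X_test before calling predict, ideally using the means computed on the training data. A column that consists entirely of NaN has a NaN mean, so it would stay unfilled and need separate handling.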
