Reputation: 67
I am working on a dataset with several missing values in its attributes.
Having done the usual data preprocessing, my next step is to fit a regression model to impute the missing values. However, when I try to use the IterativeImputer from fancyimpute, I run into this error:
C:\Users\User.DC241-12\Anaconda3\lib\site-packages\sklearn\linear_model\ridge.py:942: RuntimeWarning: overflow encountered in square
v = s ** 2
****hierarchy of filenames in which error is happening****
Input contains NaN, infinity or a value too large for dtype('float64')
I understand that missing values passed to the IterativeImputer are to be represented as NaNs, so I guess that is not the reason here. Should I be scaling my data before passing it to the imputation process? But wouldn't that affect the imputation?
Thanks!
Upvotes: 0
Views: 1243
Reputation: 83
I had a similar issue. In my case, some of the values being fed into the imputer were quite large (values > 10,000,000) and the dataset itself was large (500,000+ rows). These large values seem to get compounded somewhere in the algorithm that IterativeImputer uses, and they overflow numpy's float64.
Try scaling your values, imputing, and then scaling back up (reverse the process of scaling down) once the imputation is done.
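For what it's worth, here is a minimal sketch of that scale, impute, scale-back flow. It assumes scikit-learn's IterativeImputer (still experimental, so it needs the enable_iterative_imputer import) rather than the fancyimpute one, and it uses StandardScaler, which ignores NaNs when fitting and leaves them in place when transforming. The helper name scale_impute_unscale and the toy data are made up for illustration, so adapt them to your own columns.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables the experimental API)
from sklearn.impute import IterativeImputer
from sklearn.preprocessing import StandardScaler


def scale_impute_unscale(X, random_state=0):
    """Hypothetical helper: standardize, impute, then undo the scaling."""
    scaler = StandardScaler()
    # NaNs are skipped when computing mean/std and stay NaN after transform.
    X_scaled = scaler.fit_transform(X)

    imputer = IterativeImputer(random_state=random_state)
    X_imputed_scaled = imputer.fit_transform(X_scaled)

    # Map everything (observed and imputed) back to the original units.
    return scaler.inverse_transform(X_imputed_scaled)


# Toy data with large magnitudes (assumed, only for illustration).
rng = np.random.default_rng(0)
X = rng.normal(loc=5e7, scale=1e7, size=(200, 3))
X[:, 2] = 2.0 * X[:, 0] + rng.normal(scale=1e5, size=200)

# Knock out roughly 10% of the entries.
mask = rng.random(X.shape) < 0.1
X_missing = X.copy()
X_missing[mask] = np.nan

X_filled = scale_impute_unscale(X_missing)
```

On the worry about scaling affecting the imputation: the scaling here is a per-column linear transform that gets inverted at the end, so the observed values come back unchanged (up to floating-point rounding) and only the missing entries are filled in. The imputed values can still differ a little from an unscaled run, because the default estimator is regularized and regularization is sensitive to feature scale.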
Upvotes: 0