Reputation: 997
I use sklearn to impute some time-series which include NaN values. At the moment, I use the following:
from sklearn.preprocessing import Imputer
imp = Imputer(missing_values='NaN', strategy='mean')
signals = imp.fit_transform(array)
in which array is a NumPy array of shape n_points x n_time_steps. It works fine, but I get a deprecation warning which suggests I should use SimpleImputer from sklearn.impute. Hence I replaced those lines with the following:
from sklearn.impute import SimpleImputer
imp = SimpleImputer(missing_values='NaN', strategy='mean')
signals = imp.fit_transform(array)
but I get the following error on the last line:
ValueError: 'X' and 'missing_values' types are expected to be both numerical. Got X.dtype=float32 and type(missing_values)=<class 'str'>.
If anybody has an idea of what causes this error, I would be glad to hear it. I am using Python 3.6.7 with sklearn 0.20.1. Thanks!
Upvotes: 1
Views: 866
Reputation: 3355
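If array contains missing values represented as np.nan, you should pass np.nan (the float, not the string 'NaN') as the missing_values argument to the SimpleImputer constructor. The error occurs because SimpleImputer compares the type of missing_values with the dtype of the data, and a string does not match a float32 array. np.nan is the default value of missing_values, so this works: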
imp = SimpleImputer(strategy='mean')
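For completeness, here is a minimal end-to-end sketch of the fix. The small float32 array is made up for illustration and merely stands in for the asker's array; passing missing_values=np.nan explicitly is equivalent to relying on the default.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical stand-in for `array`: 3 points x 4 time steps, float32, with NaN gaps.
array = np.array(
    [[1.0, np.nan, 3.0, 4.0],
     [2.0, 2.0, np.nan, 6.0],
     [3.0, 5.0, 9.0, np.nan]],
    dtype=np.float32,
)

# Pass the float np.nan (not the string 'NaN') so the types match the data.
imp = SimpleImputer(missing_values=np.nan, strategy='mean')
signals = imp.fit_transform(array)

print(signals)
```

Note that SimpleImputer fills each NaN with the mean of its column, so here each gap is replaced by the mean of that time step across points. If the mean of each series is wanted instead, one option is to fit on array.T and transpose the result back.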
Upvotes: 3