Reputation: 970
I am trying to apply RFECV to a KNeighborsClassifier to eliminate insignificant features. To make the issue reproducible, here is an example with the iris data:
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFECV
from sklearn.neighbors import KNeighborsClassifier
iris = load_iris()
y = iris.target
X = iris.data
estimator = KNeighborsClassifier()
selector = RFECV(estimator, step=1, cv=5)
selector = selector.fit(X, y)
which results in the following error message:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-27-19f0f2f0f0e7> in <module>()
7 estimator = KNeighborsClassifier()
8 selector = RFECV(estimator, step=1, cv=5)
----> 9 selector.fit(X, y)
C:...\Anaconda3\lib\site-packages\sklearn\feature_selection\rfe.py in fit(self, X, y)
422 verbose=self.verbose - 1)
423
--> 424 rfe._fit(X_train, y_train, lambda estimator, features:
425 _score(estimator, X_test[:, features], y_test, scorer))
426 scores.append(np.array(rfe.scores_[::-1]).reshape(1, -1))
C:...\Anaconda3\lib\site-packages\sklearn\feature_selection\rfe.py in _fit(self, X, y, step_score)
180 coefs = estimator.feature_importances_
181 else:
--> 182 raise RuntimeError('The classifier does not expose '
183 '"coef_" or "feature_importances_" '
184 'attributes')
RuntimeError: The classifier does not expose "coef_" or "feature_importances_" attributes
If I change the classifier to a SVC as:
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFECV
from sklearn.svm import SVC
iris = load_iris()
y = iris.target
X = iris.data
estimator = SVC(kernel="linear")
selector = RFECV(estimator, step=1, cv=5)
selector = selector.fit(X, y)
it works fine. Any suggestions on how to address the issue?
NOTE: I updated Anaconda yesterday, which also updated sklearn.
Upvotes: 1
Views: 15456
Reputation: 837
You may find a partial solution in the SequentialFeatureSelector from the mlxtend library:
http://rasbt.github.io/mlxtend/user_guide/feature_selection/SequentialFeatureSelector/
(source: https://github.com/rasbt/mlxtend)
As for scikit-learn, see this issue:
https://github.com/scikit-learn/scikit-learn/issues/6920
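For what it's worth, newer scikit-learn releases (0.24 and later, as far as I know) ship their own `SequentialFeatureSelector`, which scores candidate feature subsets by cross-validation and so does not need `coef_` or `feature_importances_`. Below is a minimal sketch on the iris data from the question. It is a different technique from RFECV (greedy forward selection rather than recursive elimination), and `n_features_to_select=2` is an arbitrary value picked for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

knn = KNeighborsClassifier()

# Forward selection: start with no features and greedily add the one that
# most improves the cross-validated score. No coef_ is required.
# n_features_to_select=2 is an assumption made for this example only.
sfs = SequentialFeatureSelector(
    knn, n_features_to_select=2, direction="forward", cv=5
)
sfs.fit(X, y)

support = sfs.get_support()      # boolean mask over the 4 iris features
X_selected = sfs.transform(X)    # data restricted to the selected features
print(support)
print(X_selected.shape)
```

Setting `direction="backward"` starts from all features and removes one at a time, which is closer in spirit to what RFECV does.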
Upvotes: 1
Reputation: 66825
The error is fairly self-explanatory: KNN does not provide any measure of feature importance, and RFECV needs one to decide which feature to drop at each step. You cannot use scikit-learn's KNeighborsClassifier for this unless you define your own measure of feature importance for KNN. As far as I know, there is no generally accepted measure of that kind, which is why scikit-learn does not implement one. An SVM with a linear kernel, on the other hand, like every linear model, exposes this information through its coefficients (coef_).
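To illustrate the difference, the sketch below first checks which fitted estimator exposes the attributes RFECV looks for, and then computes one possible "own measure" for KNN: permutation importance from `sklearn.inspection` (available since scikit-learn 0.22, to my knowledge). Treat it as an assumption-laden example rather than a standard KNN importance; the train/test split, `n_repeats=10`, and `random_state=0` are arbitrary choices:

```python
from sklearn.datasets import load_iris
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, stratify=y
)

knn = KNeighborsClassifier().fit(X_train, y_train)
svc = SVC(kernel="linear").fit(X_train, y_train)

# RFECV needs one of these attributes on the fitted estimator.
knn_has_importance = hasattr(knn, "coef_") or hasattr(knn, "feature_importances_")
svc_has_importance = hasattr(svc, "coef_")
print(knn_has_importance, svc_has_importance)

# Model-agnostic alternative: shuffle one feature at a time on held-out data
# and measure how much the score drops.
result = permutation_importance(
    knn, X_test, y_test, n_repeats=10, random_state=0
)
ranking = result.importances_mean.argsort()[::-1]  # most important first
print(ranking)
```

The resulting ranking could then drive a hand-written elimination loop, dropping the lowest-ranked feature and refitting, if you want RFE-like behaviour for KNN.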
Upvotes: 4