user3140106

Reputation: 377

How to get the coefficients from RFE using sklearn?

I am using Recursive Feature Elimination (RFE) for feature selection. This works by iteratively taking an estimator such as an SVM classifier, fitting it to the data, and removing the features with the lowest weights (coefficients).

I am able to fit this to the data and perform feature selection. However, I then want to recover the learned weights for each feature from the RFE.

I use the following code to initialise a classifier object and an RFE object, and I fit these to the data.

from sklearn.feature_selection import RFE
from sklearn.svm import SVC

svc = SVC(C=1, kernel="linear")
rfe = RFE(estimator=svc, n_features_to_select=300, step=0.1)
rfe.fit(all_training, training_labels)

I then try to print the coefficients:

print('coefficients', svc.coef_)

And receive:

AttributeError: 'RFE' object has no attribute 'dual_coef_'

According to the sklearn documentation, the classifier object should have this attribute:

coef_ : array, shape = [n_class-1, n_features]
Weights assigned to the features (coefficients in the primal problem). This is only available in the case of a linear kernel.
coef_ is a readonly property derived from dual_coef_ and support_vectors_.

I am using a linear kernel, so that is not the problem.

Can anyone explain why I am unable to recover the coefficients? And is there a way around this?

Upvotes: 4

Views: 10524

Answers (2)

Aman Bassi

Reputation: 1

from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import RFE

reg = LogisticRegression()
# n_features is a placeholder: the number of features you want to select
rfe = RFE(reg, n_features_to_select=n_features)
rfe.fit(X, Y)
print(rfe.support_)

This shows you which features were selected, and it's a better way of looking at it.

Upvotes: -1

user3140106

Reputation: 377

Two minutes after posting, I took another look at the documentation for RFE and found a partial solution.

A fitted RFE object exposes the fitted estimator as its estimator_ attribute. Therefore I can call

print('coefficients', rfe.estimator_.coef_)

and get the coefficients for the selected features (i.e. this returns the coefficients for the top 300 features, because I set n_features_to_select=300 earlier).
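To line those coefficients up with the original feature columns, the boolean mask rfe.support_ can be used. Here is a minimal sketch, assuming synthetic data from make_classification in place of all_training / training_labels, and a smaller n_features_to_select; the names selected and full_coef are only illustrative:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

# Synthetic stand-in for the real training data (assumption).
X, y = make_classification(n_samples=200, n_features=30,
                           n_informative=5, random_state=0)

rfe = RFE(estimator=SVC(C=1, kernel="linear"),
          n_features_to_select=10, step=0.1)
rfe.fit(X, y)

# Original column indices of the features that survived elimination.
selected = np.flatnonzero(rfe.support_)

# Coefficients of the final estimator, one column per selected feature.
coef = rfe.estimator_.coef_

# Scatter them back into an array over all original features;
# eliminated features are left as NaN because no final weight exists for them.
full_coef = np.full((coef.shape[0], X.shape[1]), np.nan)
full_coef[:, selected] = coef

for idx, w in zip(selected, coef[0]):
    print(f"feature {idx}: weight {w:.4f}")
```

The NaN entries mark features that were eliminated before the final fit, so this still does not give their weights.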

However, I am still unable to get the coefficients for the remaining unselected features. For each iteration of the RFE, it trains the classifier and gets new coefficients for each feature. Ideally, I would like to access the coefficients that are learned at each iteration.

(So if I start with 3000 features, and use step size 300 features, the first iteration I want to access 3000 coefficients, the next iteration I want 2700 coefficients for the remaining 2700 features, the third iteration I want to access the 2400 coefficients, etc.)
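One possible way to get the coefficients at every iteration is to write the elimination loop by hand and store coef_ after each fit. The following is a rough sketch under stated assumptions, not a copy of what RFE does internally: rfe_with_history is a hypothetical helper, the data is synthetic, step is an integer number of features, and features are ranked by squared weight summed over classes.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC


def rfe_with_history(estimator, X, y, n_features_to_select, step):
    """Hand-rolled recursive feature elimination that keeps every coef_.

    Returns a list of (feature_indices, coef) pairs, one per fit, where
    feature_indices are column indices into the original X.
    """
    remaining = np.arange(X.shape[1])
    history = []
    while True:
        estimator.fit(X[:, remaining], y)
        coef = estimator.coef_.copy()
        history.append((remaining.copy(), coef))
        if remaining.size <= n_features_to_select:
            break
        # Rank features by squared weight, summed over classes.
        importance = (coef ** 2).sum(axis=0)
        # Never drop below the requested number of features.
        n_drop = min(step, remaining.size - n_features_to_select)
        # Discard the n_drop least important features, keep column order.
        keep = np.sort(np.argsort(importance)[n_drop:])
        remaining = remaining[keep]
    return history


# Synthetic stand-in for the real training data (assumption).
X, y = make_classification(n_samples=200, n_features=30,
                           n_informative=5, random_state=0)

history = rfe_with_history(SVC(C=1, kernel="linear"), X, y,
                           n_features_to_select=10, step=5)

for indices, coef in history:
    print(len(indices), "features, coef shape", coef.shape)
```

Each entry in history pairs the original column indices with the weights learned for them in that round, so the weights of features that were later eliminated remain available.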

Upvotes: 7
