An Ignorant Wanderer

Reputation: 1612

Keeping track of feature names when doing Feature Selection

When doing feature selection with the feature_selection module from sklearn, is there a way to keep track of the actual feature names instead of the default "f1", "f2", etc.? I have a huge number of features, so I can't keep track of them manually. Obviously I could write code to do this, but I'm wondering if there's an easy option I can set.

Upvotes: 1

Views: 822

Answers (1)

lalfab

Reputation: 381

If your data is in a pandas DataFrame, you can recover the names of the selected columns with the selector's get_support method. It returns a boolean mask over the input features, which you can use to index the DataFrame's columns.

Here is a quick example, adapted from the official documentation.

import pandas as pd
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression

# Toy data: three feature columns plus a binary label
X = [[ 0.87, -1.34,  0.31, 0],
     [-2.79, -0.02, -0.85, 1],
     [-1.34, -0.48, -2.55, 0],
     [ 1.92,  1.48,  0.65, 1]]

df = pd.DataFrame(X, columns=['col1', 'col2', 'col3', 'label'])
train_x = df.loc[:, ['col1', 'col2', 'col3']]
y = df['label']

# Fit the selector on the DataFrame so the column order is preserved
selector = SelectFromModel(estimator=LogisticRegression()).fit(train_x, y)

# Boolean mask over the input columns: True where a feature was kept
col_index = selector.get_support()
print(train_x.columns[col_index])
# output print --> Index(['col2'], dtype='object')

Upvotes: 2
