Stuart C

Reputation: 175

Sklearn feature selection

I have been unable to use any of the sklearn feature selection methods without getting the following error:

"TypeError: cannot perform reduce with flexible type"

Working from examples, the feature selection methods appear to work only for non-classification problems. I am, of course, trying to solve a classification problem. How can I fix this?

Example code:

from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_boston
import random

# Load data
boston = load_boston()
X = boston["data"]
Y = boston["target"]

# Make a classification problem
classes = ['a', 'b', 'c']
Y = [random.choice(classes) for entry in Y]

# Perform feature selection
names = boston["feature_names"]
lr = LinearRegression()
rfe = RFE(lr, n_features_to_select=1)
rfe.fit(X, Y)

print("Features sorted by their rank:")
print(sorted(zip(map(lambda x: round(x, 4), rfe.ranking_), names)))

Upvotes: 1

Views: 798

Answers (1)

Wasi Ahmad

Reputation: 37761

I guess the following will solve your problem.

import numpy as np
X = np.array(X, dtype=float)
Y = np.array(Y, dtype=float)

Do this before calling the fit method. You can also use int instead of float; it depends on the data type you need. Note that this conversion only works if the values are already numeric.

If your labels are strings, you can use LabelEncoder to encode them as integers.

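from sklearn import preprocessing
le = preprocessing.LabelEncoder()
Y_encoded = le.fit_transform(Y)  # keep the fitted encoder in le so you can call inverse_transform later
model.fit(X, Y_encoded)  # model is your estimator, e.g. rfe in the question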
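
To put the pieces together, here is a minimal end-to-end sketch. It makes a few assumptions that are not in the question: the data comes from make_classification (a synthetic stand-in, since load_boston is no longer available in recent scikit-learn releases), the feature names f0..f5 are made up, and the estimator inside RFE is swapped to LogisticRegression. The error in the question is most likely caused by LinearRegression, a regressor, trying to do arithmetic on string targets, so a classifier seems the better fit here, but treat that as my reading of the problem rather than a confirmed diagnosis.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder

# Synthetic stand-in for the question's data (assumption: 200 samples, 6 features).
X, y_int = make_classification(
    n_samples=200, n_features=6, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
classes = np.array(["a", "b", "c"])
Y = classes[y_int]  # string labels, as in the question

# Encode the string labels as integers 0..n_classes-1.
le = LabelEncoder()
Y_encoded = le.fit_transform(Y)

# Use a classifier (not LinearRegression) as the estimator inside RFE.
clf = LogisticRegression(max_iter=1000)
rfe = RFE(clf, n_features_to_select=1)
rfe.fit(X, Y_encoded)

names = ["f%d" % i for i in range(X.shape[1])]  # hypothetical feature names
ranked = sorted(zip(rfe.ranking_.tolist(), names))
print("Features sorted by their rank:")
print(ranked)
```

If you need the original string labels back (for example, to report predictions), le.inverse_transform maps the integers to 'a', 'b', 'c' again.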

Upvotes: 1
