Reputation: 11
This is my dataframe:
First I tried to rescale it using MinMaxScaler:
array = df.values
X = array[:,1:5]
Y = array[:,5]
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range = (0, 1))
rescaledX = scaler.fit_transform(X)
print(rescaledX[0:5,:])
[[1. 1. 1. 1. ]
[0.62941362 0.69159574 0.72880726 0.65628435]
[0.72207955 0.53431153 0.61756924 0.62263943]
[0.61745053 0.48542381 0.49937301 0.52598285]
[0.45269065 0.54966355 0.57468495 0.48724943]]
Then I tried to use RFE and LogisticRegression:
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
rfe = RFE(model, n_features_to_select=2)
fit = rfe.fit(rescaledX, Y)
print("Number of attributes: %d" % fit.n_features_)
print(df.columns[1:5])
print("Attributes Selected: %s" % fit.support_)
print("Attribute Ranking: %s" % fit.ranking_)
But all I get is a ValueError message:
ValueError: Unknown label type: 'unknown'
Can someone please help me identify my mistake?
Upvotes: 1
Views: 2653
Reputation: 2042
LogisticRegression is not for regression, despite its name; it is used for classification problems.
If you want to use LogisticRegression, the y variable must hold class labels (for example: 0, 1, 2, 3), not a continuous variable as you have.
To deal with continuous outputs, you should use a regression algorithm such as LinearRegression.
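
As a rough sketch of that suggestion, the RFE step from the question can be run with LinearRegression in place of LogisticRegression. The dataframe below is a hypothetical stand-in (its column names and values are invented for illustration), built with one non-numeric column followed by four features and a continuous target. The astype(float) casts are an assumption: when df.values mixes strings and numbers it comes back with object dtype, which is probably why scikit-learn reported the label type as 'unknown' rather than 'continuous'.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for the question's dataframe: one text column,
# four numeric features, and a continuous target driven by f1 and f3.
rng = np.random.default_rng(0)
features = rng.random((50, 4))
target = 3.0 * features[:, 0] - 2.0 * features[:, 2] + rng.normal(0.0, 0.01, 50)
df = pd.DataFrame(features, columns=["f1", "f2", "f3", "f4"])
df.insert(0, "name", ["row%d" % i for i in range(50)])
df["target"] = target

array = df.values                 # object dtype because of the text column
X = array[:, 1:5].astype(float)   # assumption: cast features to float
Y = array[:, 5].astype(float)     # assumption: cast target to float

rescaledX = MinMaxScaler(feature_range=(0, 1)).fit_transform(X)

# LinearRegression exposes coef_, which RFE uses to rank the features.
model = LinearRegression()
rfe = RFE(model, n_features_to_select=2)
fit = rfe.fit(rescaledX, Y)

print("Number of attributes: %d" % fit.n_features_)
print(df.columns[1:5])
print("Attributes Selected: %s" % fit.support_)
print("Attribute Ranking: %s" % fit.ranking_)
```

If the target really is meant to be a set of classes, the alternative is to keep LogisticRegression and convert Y to integer or string labels before calling fit.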
Upvotes: 1