José de Castro

Reputation: 11

ValueError: Unknown label type: 'unknown' - sklearn

This is my dataframe:

df.head()

First I tried to rescale it using MinMaxScaler:

array = df.values
X = array[:,1:5]
Y = array[:,5]

from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range = (0, 1))
rescaledX = scaler.fit_transform(X)
print(rescaledX[0:5,:])

[[1.         1.         1.         1.        ]
 [0.62941362 0.69159574 0.72880726 0.65628435]
 [0.72207955 0.53431153 0.61756924 0.62263943]
 [0.61745053 0.48542381 0.49937301 0.52598285]
 [0.45269065 0.54966355 0.57468495 0.48724943]] 

Then I tried to use RFE and LogisticRegression:

from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
rfe = RFE(model, n_features_to_select=2)
fit = rfe.fit(rescaledX, Y)
print("Number of attributes: %d" % fit.n_features_)
print(df.columns[1:5])
print("Attributes Selected: %s" % fit.support_)
print("Attribute Ranking: %s" % fit.ranking_)

But all I get is a ValueError message:

ValueError: Unknown label type: 'unknown'

Can someone please help me identify my mistake?

Upvotes: 1

Views: 2653

Answers (1)

Alex Serra Marrugat

Reputation: 2042

Despite its name, LogisticRegression is not a regression estimator; it is used for classification problems.

If you want to use LogisticRegression, the y variable must contain discrete class labels (for example: 0, 1, 2, 3), not a continuous variable as you have.
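To check how scikit-learn interprets a target array, you can call type_of_target from sklearn.utils.multiclass. The sketch below uses made-up values, not the data from the question. It also includes an object-dtype array, since df.values can return one when the DataFrame mixes column types; that is an assumption about your data, but it may explain why the message says 'unknown' rather than 'continuous'.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Made-up targets for illustration only; not the asker's data.
class_labels = np.array([0, 1, 2, 3])        # discrete class labels
continuous = np.array([0.5, 1.25, 3.75])     # real-valued target
as_object = continuous.astype(object)        # same values, object dtype

# Print the label type scikit-learn infers for each array.
for name, y in [("class_labels", class_labels),
                ("continuous", continuous),
                ("as_object", as_object)]:
    print(name, "->", type_of_target(y))
```

Classifiers such as LogisticRegression accept only the class-label kinds; the other two are rejected with "Unknown label type".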

To deal with continuous outputs, you should use a regression algorithm instead, for example LinearRegression.
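As a minimal sketch of that suggestion, here is RFE wrapped around LinearRegression. The data is synthetic (four random features and a continuous target built from two of them), standing in for the DataFrame in the question, so the printed selection says nothing about your real columns. The explicit float cast is an assumption about your data: it matters if df.values gave you an object array.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the question's data: 4 features, continuous target
# that depends only on features 0 and 2 (plus a little noise).
rng = np.random.default_rng(0)
X = rng.random((50, 4))
Y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.01, size=50)

# With real data taken from df.values the arrays may have object dtype.
# The cast is a no-op on this synthetic data but shows where it would go.
X = X.astype(float)
Y = Y.astype(float)

rescaledX = MinMaxScaler(feature_range=(0, 1)).fit_transform(X)

# Recursive feature elimination with a regression estimator.
rfe = RFE(LinearRegression(), n_features_to_select=2)
fit = rfe.fit(rescaledX, Y)

print("Number of attributes: %d" % fit.n_features_)
print("Attributes Selected: %s" % fit.support_)
print("Attribute Ranking: %s" % fit.ranking_)
```

Any regressor that exposes coef_ or feature_importances_ after fitting could replace LinearRegression here.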

Upvotes: 1
