user1960836

Reputation: 1782

Getting error when trying to use cross validation

I am trying to get cross-validation scores for a decision tree, but I get this error instead:

C:\Users\my_is\anaconda3\lib\site-packages\sklearn\model_selection\_validation.py:548: FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan. Details: 
Traceback (most recent call last):
  File "C:\Users\my_is\anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 531, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "C:\Users\my_is\anaconda3\lib\site-packages\sklearn\tree\_classes.py", line 890, in fit
    super().fit(
  File "C:\Users\my_is\anaconda3\lib\site-packages\sklearn\tree\_classes.py", line 181, in fit
    check_classification_targets(y)
  File "C:\Users\my_is\anaconda3\lib\site-packages\sklearn\utils\multiclass.py", line 172, in check_classification_targets
    raise ValueError("Unknown label type: %r" % y_type)
ValueError: Unknown label type: 'continuous'

Here is my code:

from sklearn.model_selection import cross_validate
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
import numpy as np
from sklearn.tree import DecisionTreeClassifier

data = load_boston()
    
c = np.array([1 if y > np.median(data['target']) else 0 for y in data['target']])
X_train, X_test, c_train, c_test = train_test_split(data['data'], c, random_state=0)

tree = DecisionTreeClassifier()
tree.fit(X_train, c_train)
#print(data.target)

#logReg = LogisticRegression()    
#logReg.fit(X_train, c_train)
#result = cross_validate(logReg, data.data, data.target, cv=5, return_train_score=True)
result = cross_validate(tree, data.data, data.target, cv=5, return_train_score=True)
    
display(result)

I am completely new to Python and ML, so any help is appreciated.

Upvotes: 0

Views: 382

Answers (1)

Danylo Baibak

Reputation: 2326

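The mistake is in this line. It passes the continuous target data.target to DecisionTreeClassifier, which expects discrete class labels, hence the ValueError: Unknown label type: 'continuous':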

result = cross_validate(tree, data.data, data.target, cv=5, return_train_score=True)

It should pass the binarized labels c that you already built from the median:

result = cross_validate(tree, data.data, c, cv=5, return_train_score=True)
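To see why the label array matters, here is a minimal sketch of the same situation. It assumes a synthetic dataset from make_regression as a stand-in for the Boston data (load_boston is no longer shipped in recent scikit-learn releases), and the sample and feature counts are arbitrary choices for illustration. type_of_target is the scikit-learn helper that reports how a target array is interpreted.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils.multiclass import type_of_target

# Assumption: synthetic regression data stands in for the Boston dataset.
# y is a continuous target, like data.target in the question.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Binarize the continuous target at its median, as the question does for c.
c = (y > np.median(y)).astype(int)

# The classifier rejects y because scikit-learn sees it as a regression target.
print(type_of_target(y))
print(type_of_target(c))

# Cross-validate against the class labels c, not the continuous target y.
tree = DecisionTreeClassifier(random_state=0)
result = cross_validate(tree, X, c, cv=5, return_train_score=True)
print(result["test_score"])
```

Note that cross_validate fits clones of the estimator on each fold, so the earlier tree.fit(X_train, c_train) call is not needed to get the cross-validation scores.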

Upvotes: 1
