Reputation: 1782
I am trying to get a cross-validation result from a decision tree, but I get this error instead:
C:\Users\my_is\anaconda3\lib\site-packages\sklearn\model_selection\_validation.py:548: FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "C:\Users\my_is\anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 531, in _fit_and_score
estimator.fit(X_train, y_train, **fit_params)
File "C:\Users\my_is\anaconda3\lib\site-packages\sklearn\tree\_classes.py", line 890, in fit
super().fit(
File "C:\Users\my_is\anaconda3\lib\site-packages\sklearn\tree\_classes.py", line 181, in fit
check_classification_targets(y)
File "C:\Users\my_is\anaconda3\lib\site-packages\sklearn\utils\multiclass.py", line 172, in check_classification_targets
raise ValueError("Unknown label type: %r" % y_type)
ValueError: Unknown label type: 'continuous'
Here is my code:
from sklearn.model_selection import cross_validate
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
import numpy as np
from sklearn.tree import DecisionTreeClassifier
data = load_boston()
c = np.array([1 if y > np.median(data['target']) else 0 for y in data['target']])
X_train, X_test, c_train, c_test = train_test_split(data['data'], c, random_state=0)
tree = DecisionTreeClassifier()
tree.fit(X_train, c_train)
#print(data.target)
#logReg = LogisticRegression()
#logReg.fit(X_train, c_train)
#result = cross_validate(logReg, data.data, data.target, cv=5, return_train_score=True)
result = cross_validate(tree, data.data, data.target, cv=5, return_train_score=True)
display(result)
I am completely new to Python and ML; any help is appreciated.
Upvotes: 0
Views: 382
Reputation: 2326
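You are passing data.target, the original continuous values, to cross_validate instead of the binarized labels c that you built and trained on. A classifier cannot fit a continuous target, which is why the error says Unknown label type: 'continuous'. The mistake is here: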
result = cross_validate(tree, data.data, data.target, cv=5, return_train_score=True)
It should pass c as the target instead:
result = cross_validate(tree, data.data, c, cv=5, return_train_score=True)
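For reference, here is a minimal sketch of the corrected call that also checks how scikit-learn classifies each target. It rests on one assumption that is not in the question: load_boston was removed in scikit-learn 1.2, so synthetic data from make_regression stands in for the Boston dataset. type_of_target comes from the same sklearn.utils.multiclass module that appears in the traceback.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils.multiclass import type_of_target

# Assumption: synthetic regression data stands in for the Boston dataset,
# which is no longer shipped with recent scikit-learn releases.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Binarize the continuous target around its median, as in the question.
c = (y > np.median(y)).astype(int)

# The raw target is 'continuous'; the binarized one is 'binary'.
y_kind = type_of_target(y)
c_kind = type_of_target(c)
print(y_kind, c_kind)

# Cross-validate against the class labels, not the raw target.
tree = DecisionTreeClassifier(random_state=0)
result = cross_validate(tree, X, c, cv=5, return_train_score=True)
print(result["test_score"])
```

Note that cross_validate clones and refits the estimator on each fold, so the earlier tree.fit(X_train, c_train) call has no effect on these scores.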
Upvotes: 1