Alex

Reputation: 649

Can't handle mix of multiclass and continuous

The true output has four classes: [0, 1, 2, 3]. The prediction is a continuous number in [0, 1] (after applying a sigmoid function).

I have tried confusion_matrix and f1_score in sklearn, but both raise the same error:

ValueError: Can't handle mix of multiclass and continuous

If I reduce the problem to a binary classifier and evaluate it with AUC, there is no error, which suggests that AUC can handle continuous inputs.

My question is: where can I find an evaluation metric in sklearn that not only deals with multiple classes but also handles continuous inputs?

Upvotes: 1

Views: 9544

Answers (1)

ginge

Reputation: 1972

Before dealing with the particulars of your problem, you need to make sure you understand the AUC metric and how to use it properly.

To understand what the AUC metric means, you can start here.

In essence, you turn the continuous scores into class predictions at a range of thresholds (i.e. move the threshold around and get new predictions each time), calculate the false positive rate and true positive rate at each threshold, and then compute the area under the resulting curve (the AUC).
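
For the binary case, the threshold sweep described above can be sketched with sklearn's roc_curve and auc. The labels and scores here are made-up toy values, not data from the question:

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Hypothetical toy data: binary labels and sigmoid-like scores in [0, 1].
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

# roc_curve sweeps the decision threshold over the observed scores and
# returns the false positive rate and true positive rate at each threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# The AUC is the area under the (fpr, tpr) curve (trapezoidal rule).
roc_auc = auc(fpr, tpr)
print(roc_auc)  # 0.75
```

Note that the scores are passed in as-is; no rounding to 0/1 is needed, which is why AUC does not raise the "mix of multiclass and continuous" error in the binary setting.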

Calculating and evaluating multi-class AUC is not straightforward. You can find more information here, but I attach below a good code snippet to get you started.

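# Compute macro-average ROC curve and ROC area
# Assumes fpr, tpr and roc_auc are dicts that already hold the per-class
# (and "micro") ROC results, and that n_classes is defined.
from itertools import cycle

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import auc

# First aggregate all false positive rates,
# assuming fpr holds the false positive rates per class
all_fpr = np.unique(np.concatenate([fpr[i] for i in range(n_classes)]))

# Then interpolate all ROC curves at these points
mean_tpr = np.zeros_like(all_fpr)
for i in range(n_classes):
    mean_tpr += np.interp(all_fpr, fpr[i], tpr[i])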

# Finally average it and compute AUC
mean_tpr /= n_classes

fpr["macro"] = all_fpr
tpr["macro"] = mean_tpr
roc_auc["macro"] = auc(fpr["macro"], tpr["macro"])

# Plot all ROC curves
lw = 2  # line width used by the plots below (assumed value)
plt.figure()
plt.plot(fpr["micro"], tpr["micro"],
         label='micro-average ROC curve (area = {0:0.2f})'.format(roc_auc["micro"]),
         color='deeppink', linestyle=':', linewidth=4)

plt.plot(fpr["macro"], tpr["macro"],
         label='macro-average ROC curve (area = {0:0.2f})'.format(roc_auc["macro"]),
         color='navy', linestyle=':', linewidth=4)

# One colour per class; a fourth is added so that four classes do not share a colour
colors = cycle(['aqua', 'darkorange', 'cornflowerblue', 'green'])
for i, color in zip(range(n_classes), colors):
    plt.plot(fpr[i], tpr[i], color=color, lw=lw,
             label='ROC curve of class {0} (area = {1:0.2f})'.format(i, roc_auc[i]))

plt.plot([0, 1], [0, 1], 'k--', lw=lw)
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Some extension of Receiver operating characteristic to multi-class')
plt.legend(loc="lower right")
plt.show()
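
The snippet above takes the fpr, tpr and roc_auc dictionaries as given. One way they might be built, sketched under the assumption that the model outputs one score per class (an array of shape (n_samples, n_classes)) rather than a single number, is a one-vs-rest loop. The data below is synthetic and only there to make the example run:

```python
import numpy as np
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import label_binarize

n_classes = 4

# Hypothetical data: true labels in {0, 1, 2, 3} and one score per class.
rng = np.random.default_rng(0)
y_true = np.tile(np.arange(n_classes), 50)        # 200 labels, 50 per class
y_score = rng.random((y_true.size, n_classes))
y_score[np.arange(y_true.size), y_true] += 0.5    # favour the true class

# One-vs-rest: turn the labels into an indicator matrix, one column per class.
y_onehot = label_binarize(y_true, classes=list(range(n_classes)))

fpr, tpr, roc_auc = {}, {}, {}
for i in range(n_classes):
    # Treat class i as "positive" and every other class as "negative".
    fpr[i], tpr[i], _ = roc_curve(y_onehot[:, i], y_score[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])

# Micro-average: pool every (sample, class) decision into one binary problem.
fpr["micro"], tpr["micro"], _ = roc_curve(y_onehot.ravel(), y_score.ravel())
roc_auc["micro"] = auc(fpr["micro"], tpr["micro"])
```

If the model really emits only a single sigmoid output for four classes, there are no per-class scores to feed into this loop, so the output layer would need to change (for example to a softmax over the four classes) before a multi-class AUC makes sense.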

Upvotes: 2
