Reputation: 1879
Hello, I am working with sklearn and, in order to better understand the metrics, I followed this example of precision_score:
from sklearn.metrics import precision_score
y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]
print(precision_score(y_true, y_pred, average='macro'))
The result I got was the following:
0.222222222222
I understand that sklearn computes that result following these steps: it first calculates the precision for each label separately (label 0: 2 of the 3 predictions of 0 are correct, so 0.66; labels 1 and 2: none of their predictions are correct, so 0 each), and finally it takes the mean precision over the three labels: precision = (0.66 + 0 + 0) / 3 = 0.22
This result is given if we use these parameters:
precision_score(y_true, y_pred, average='macro')
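For reference, the per-class precisions can be inspected directly by passing average=None; here is a quick check on the same data (just a sketch):

from sklearn.metrics import precision_score
import numpy as np

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

# per-class precisions: label 0 -> 2/3, labels 1 and 2 -> 0
per_class = precision_score(y_true, y_pred, average=None)
print(per_class)                                          # approximately [0.667, 0., 0.]

# 'macro' is simply the unweighted mean of these values
print(np.mean(per_class))                                 # 0.2222...
print(precision_score(y_true, y_pred, average='macro'))   # 0.2222...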
On the other hand, if we take the same call but change to average='micro':
precision_score(y_true, y_pred, average='micro')
then we get:
0.33
and if we take average='weighted':
precision_score(y_true, y_pred, average='weighted')
then we obtain:
0.22.
I don't understand well how sklearn computes this metric when the average parameter is set to 'weighted' or 'micro'. I would really appreciate it if someone could give me a clear explanation of this.
Upvotes: 6
Views: 10054
Reputation: 1
Your details for 'macro' are correct: it is indeed the mean of the per-class/per-label precisions.
For 'micro', it is the total number of true positives - basically, how many times the predicted label matches the true one - divided by the common length of y_true and y_pred (i.e. the number of predictions). Here the labels match at indices 0 and 3 (in both cases a label 0 is correctly predicted), so two matches, and 2/6 = 0.33 (rounded).
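To see this on the toy data (a small sketch), you can count the exact matches yourself and compare with average='micro':

from sklearn.metrics import precision_score

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

matches = sum(t == p for t, p in zip(y_true, y_pred))      # 2 correct predictions
print(matches / len(y_true))                               # 0.3333...
print(precision_score(y_true, y_pred, average='micro'))    # 0.3333...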
'weighted' is a weighted version of 'macro'. You compute all three precisions, one for each class, so:
prec_0 = 2/3, prec_1 = 0 and prec_2 = 0
and you assign to each of them a weight equal to the frequency of its class in y_true.
Here, 2 out of 6 samples are labelled 0 in y_true,
and it is the same for labels 1 and 2.
So the weights are w_0 = 2/6 = 1/3, w_1 = 1/3 and w_2 = 1/3.
In the end, the "weighted precision" is the sum of the weighted individual precisions, so here:
2/3 * 1/3 + 0 * 1/3 + 0 * 1/3 = 2/9 = 0.22 (rounded).
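Put into code (a small sketch reusing the same arrays), this amounts to a dot product between the per-class precisions and the class frequencies of y_true:

import numpy as np
from sklearn.metrics import precision_score

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

per_class = precision_score(y_true, y_pred, average=None)    # [2/3, 0, 0]
support = np.bincount(y_true)                                # [2, 2, 2] true instances per class
weights = support / support.sum()                            # [1/3, 1/3, 1/3]

print(np.dot(per_class, weights))                            # 0.2222...
print(precision_score(y_true, y_pred, average='weighted'))   # 0.2222...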
And, by the way, with average=None, precision_score returns all the per-class precisions as an array, so here: [2/3, 0, 0].
There is also 'samples', which applies to multilabel (binary indicator) inputs such as:
y_true = [[1, 0], [0, 1], [1, 0]]
y_pred = [[1, 0], [0, 1], [0, 1]]
It computes the precision for each sample (three here) and averages them.
Here the per-sample precisions are 1, 1 and 0, so
precision_score(y_true, y_pred, average="samples")
returns (1 + 1 + 0) / 3 = 2/3.
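A minimal runnable sketch of that case (note the indicator-matrix inputs; average='samples' is only defined for multilabel data):

import numpy as np
from sklearn.metrics import precision_score

y_true = np.array([[1, 0], [0, 1], [1, 0]])
y_pred = np.array([[1, 0], [0, 1], [0, 1]])

# per-sample precisions are 1, 1 and 0, averaged to 2/3
print(precision_score(y_true, y_pred, average='samples'))    # 0.6666...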
Upvotes: 0
Reputation: 25639
'micro': Calculate metrics globally by considering each element of the label indicator matrix as a label.
'macro': Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
'weighted': Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label).
'samples': Calculate metrics for each instance, and find their average.
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.average_precision_score.html
For the support measure, see: http://scikit-learn.org/stable/modules/generated/sklearn.metrics.classification_report.html
Basically, support is the class membership count: the number of true instances of each class.
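For example, classification_report prints the per-class precision, recall and f1-score next to the support column (a quick sketch with the question's data):

from sklearn.metrics import classification_report

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

# the 'support' column shows 2 true instances for each of the labels 0, 1 and 2
print(classification_report(y_true, y_pred))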
3.3.2.12. Receiver operating characteristic (ROC)
The function roc_curve computes the receiver operating characteristic curve, or ROC curve. Quoting Wikipedia:
“A receiver operating characteristic (ROC), or simply ROC curve, is a graphical plot which illustrates the performance of a binary classifier system as its discrimination threshold is varied. It is created by plotting the fraction of true positives out of the positives (TPR = true positive rate) vs. the fraction of false positives out of the negatives (FPR = false positive rate), at various threshold settings. TPR is also known as sensitivity, and FPR is one minus the specificity or true negative rate.”
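A minimal usage sketch of roc_curve (the binary labels and scores below are made up for illustration):

from sklearn.metrics import roc_curve

y_true = [0, 0, 1, 1]
y_scores = [0.1, 0.4, 0.35, 0.8]    # e.g. predicted probabilities of the positive class

fpr, tpr, thresholds = roc_curve(y_true, y_scores)
print(fpr)    # false positive rate at each threshold
print(tpr)    # true positive rate at each threshold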
TN / True Negative: case was negative and predicted negative.
TP / True Positive: case was positive and predicted positive.
FN / False Negative: case was positive but predicted negative.
FP / False Positive: case was negative but predicted positive.

# Basic terminology
from sklearn import metrics

# expected and predicted hold the binary ground-truth and predicted labels
confusion = metrics.confusion_matrix(expected, predicted)
print(confusion, "\n")

# unpack the 2x2 confusion matrix
TN, FP = confusion[0, 0], confusion[0, 1]
FN, TP = confusion[1, 0], confusion[1, 1]

print('Specificity: ', round(TN / float(TN + FP), 3) * 100, "\n")
print('Sensitivity: ', round(TP / float(TP + FN), 3) * 100, "(Recall)")
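For instance, with some made-up binary labels (hypothetical values, just to show the snippet in action):

expected  = [0, 0, 0, 1, 1, 1, 1, 0]
predicted = [0, 0, 1, 1, 1, 0, 1, 0]

Running the code above on these lists prints the 2x2 confusion matrix [[3 1] [1 3]], a specificity of 75.0 and a sensitivity (recall) of 75.0.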
Upvotes: 5