neo33

Reputation: 1879

How does sklearn compute the precision_score metric?

Hello, I am working with sklearn and, in order to better understand the metrics, I followed this example of precision_score:

from sklearn.metrics import precision_score

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]
print(precision_score(y_true, y_pred, average='macro'))

The result that I got was the following:

0.222222222222

I understand that sklearn computes that result following these steps:

- label 0: it is predicted 3 times and 2 of those predictions are correct, so precision = 2/3 = 0.66
- label 1: it is predicted 2 times and neither prediction is correct, so precision = 0/2 = 0
- label 2: it is predicted 1 time and that prediction is wrong, so precision = 0/1 = 0

and finally sklearn calculates the mean precision over all three labels: precision = (0.66 + 0 + 0) / 3 = 0.22

This result is given if we use these parameters:

precision_score(y_true, y_pred, average='macro')
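
For reference, the per-class values behind this can be inspected with average=None, which returns one precision per label; their unweighted mean reproduces the 'macro' score (a quick sketch):

from sklearn.metrics import precision_score

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

# one precision per label: [0.667, 0., 0.]
per_label = precision_score(y_true, y_pred, average=None)
print(per_label)
# the unweighted mean of these is the 'macro' score
print(per_label.mean())   # 0.222...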

On the other hand, if we change the parameter to average='micro':

precision_score(y_true, y_pred, average='micro') 

then we get:

0.33

and if we use average='weighted':

precision_score(y_true, y_pred, average='weighted')

then we obtain:

0.22

I don't fully understand how sklearn computes this metric when the average parameter is set to 'weighted' or 'micro', and I would really appreciate it if someone could give me a clear explanation.

Upvotes: 6

Views: 10054

Answers (2)

Toby

Reputation: 1

Your details for 'macro' are correct. It is indeed the mean of the precisions per class/label.

For 'micro', it is the number of all true positives - so basically how many times the predicted label matches the true label - divided by the total number of predictions (the common length of y_true and y_pred). Here the labels match at indices 0 and 3 (in both cases it is a label 0 correctly predicted), so two times, and 2/6 = 0.33 (rounded).
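
A quick way to check this (a small sketch reusing the question's y_true and y_pred):

from sklearn.metrics import precision_score

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

# 'micro' precision = total correct predictions / total predictions
matches = sum(t == p for t, p in zip(y_true, y_pred))    # 2
print(matches / len(y_pred))                             # 0.333...
print(precision_score(y_true, y_pred, average='micro'))  # 0.333...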

'weighted' is a weighted version of 'macro'. You compute all three precisions, one for each class: prec_0 = 2/3, prec_1 = 0 and prec_2 = 0. Then you assign to each of them a weight equal to the frequency of its class in y_true. Here, 2 out of 6 samples are labelled 0 in y_true, and the same holds for labels 1 and 2, so the weights are w_0 = 2/6 = 1/3, w_1 = 1/3 and w_2 = 1/3. In the end, the "weighted precision" is the sum of the weighted individual precisions, so here: 2/3 * 1/3 + 0 * 1/3 + 0 * 1/3 = 2/9 = 0.22 (rounded).

And, by the way, with None as the value for average, it returns all the precisions as an array, so here: [2/3, 0, 0].
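
Both of these can be checked with a small sketch (numpy is only used here for the weighted sum):

import numpy as np
from sklearn.metrics import precision_score

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

per_class = precision_score(y_true, y_pred, average=None)   # [0.667, 0., 0.]
weights = np.bincount(y_true) / len(y_true)                 # [1/3, 1/3, 1/3], the class frequencies in y_true

print(np.sum(per_class * weights))                          # 0.222...
print(precision_score(y_true, y_pred, average='weighted'))  # 0.222...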

There is also 'samples', which acts on multilabel (binary indicator) inputs such as:

y_true = [[1, 0], [0, 1], [1, 0]]
y_pred = [[1, 0], [0, 1], [0, 1]]

and computes the precision for each sample (three here), so 1, 1 and 0, and then averages them. Here precision_score(y_true, y_pred, average="samples") returns (1 + 1 + 0) / 3 = 2/3.
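
The same example as a runnable sketch (a list of lists of 0/1 is interpreted by sklearn as a multilabel indicator matrix):

from sklearn.metrics import precision_score

# one row per sample, one column per label
y_true = [[1, 0], [0, 1], [1, 0]]
y_pred = [[1, 0], [0, 1], [0, 1]]

# precision is computed per sample (row), then averaged over the samples
print(precision_score(y_true, y_pred, average='samples'))   # (1 + 1 + 0) / 3 = 0.666...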

Upvotes: 0

Merlin

Reputation: 25639

'micro':

Calculate metrics globally by considering each element of the label indicator matrix as a label.

'macro':

Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.

'weighted':

Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label).

'samples':

Calculate metrics for each instance, and find their average.

http://scikit-learn.org/stable/modules/generated/sklearn.metrics.average_precision_score.html

For Support measures: http://scikit-learn.org/stable/modules/generated/sklearn.metrics.classification_report.html

Basically, support is the class membership count: the number of true instances of each class.
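
For example, classification_report prints precision, recall, f1-score and the support for each class; a small sketch with the question's labels:

from sklearn.metrics import classification_report

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

# the rightmost "support" column is the number of true instances of each class (2 each here)
print(classification_report(y_true, y_pred))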

3.3.2.12. Receiver operating characteristic (ROC)

The function roc_curve computes the receiver operating characteristic curve, or ROC curve. Quoting Wikipedia :

“A receiver operating characteristic (ROC), or simply ROC curve, is a graphical plot which illustrates the performance of a binary classifier system as its discrimination threshold is varied. It is created by plotting the fraction of true positives out of the positives (TPR = true positive rate) vs. the fraction of false positives out of the negatives (FPR = false positive rate), at various threshold settings. TPR is also known as sensitivity, and FPR is one minus the specificity or true negative rate.”

TN / True Negative: case was negative and predicted negative.

TP / True Positive: case was positive and predicted positive.

FN / False Negative: case was positive but predicted negative.

FP / False Positive: case was negative but predicted positive.

from sklearn import metrics

# Basic terminology, on example binary labels ('expected' = ground truth, 'predicted' = model output)
expected  = [0, 1, 1, 0, 1, 0]
predicted = [0, 1, 0, 0, 1, 1]

confusion = metrics.confusion_matrix(expected, predicted)
print(confusion, "\n")
TN, FP = confusion[0, 0], confusion[0, 1]
FN, TP = confusion[1, 0], confusion[1, 1]

print('Specificity: ', round(TN / (TN + FP), 3) * 100, "\n")
print('Sensitivity: ', round(TP / (TP + FN), 3) * 100, "(Recall)")
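
And since the quoted section is about roc_curve, a minimal sketch of its use (the scores below are just illustrative probabilities of the positive class):

from sklearn.metrics import roc_curve, auc

y_true  = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# one (FPR, TPR) point per threshold; together they form the ROC curve
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(fpr, tpr, thresholds)
print(auc(fpr, tpr))   # area under the ROC curve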

Upvotes: 5
