Reputation: 4766
from sklearn.metrics import precision_score

a = [1, 2, 1, 1, 2]  # true labels
b = [1, 2, 2, 1, 1]  # predicted labels

print(precision_score(a, b, labels=[1]))
# 0.6666
print(precision_score(a, b, labels=[2]))
# 0.5
print(precision_score(a, b, labels=[1, 2]))
# 0.6666
Why are the values the same for the first and last calls?
Calculating by hand (3 of the 5 predictions in b match a), the total precision should be 3/5 = 0.6. But the third call outputs 0.6666, which happens to be the same value as the first one.
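For reference, here is the by-hand count as a minimal plain-Python sketch (since every prediction falls inside the label set, the total precision over both labels is just the fraction of correct predictions):

a = [1, 2, 1, 1, 2]  # true labels
b = [1, 2, 2, 1, 1]  # predicted labels
correct = sum(1 for t, p in zip(a, b) if t == p)  # matches at positions 0, 1, 3
print(correct / len(b))  # 3/5 = 0.6 (use float(correct) on Python 2)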
Edit 1: Added the import path to the function in question.
Upvotes: 2
Views: 291
Reputation: 2111
See here (http://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_score.html#sklearn.metrics.precision_score) for the documentation. I think you need to change the average argument to 'micro' to get the overall precision across the specified labels, i.e.:
print(precision_score(a, b, labels=[1, 2], average='micro'))
The default value for average is 'weighted', which computes a weighted average of the precision over the specified labels. If you use 'micro', according to the documentation, it computes the precision over all true and false positives (presumably "all" means all the specified labels, but the documentation is not clear on this). I think this is what you want? I have not been able to check this, as I don't know which version of scikit-learn you're using.
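For example, a quick check (a sketch; I'm assuming a scikit-learn version where the explicit average argument behaves as documented):

from sklearn.metrics import precision_score

a = [1, 2, 1, 1, 2]
b = [1, 2, 2, 1, 1]

# micro pools true/false positives across the listed labels:
# TP = 2 (label 1) + 1 (label 2) = 3, out of 5 positive predictions
print(precision_score(a, b, labels=[1, 2], average='micro'))
# 0.6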
Upvotes: 1
Reputation: 363487
You have to tell precision_score for which label it should compute the precision. What you're seeing is the precision for label 1:
>>> precision_score(a, b)
0.66666666666666663
>>> precision_score(a, b, pos_label=1)
0.66666666666666663
But you want the precision for label 2:
>>> precision_score(a, b, pos_label=2)
0.5
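As an aside (a sketch against a recent scikit-learn release; I haven't checked it on the version used above), average=None returns the per-label precisions in one call:

>>> precision_score(a, b, labels=[1, 2], average=None)
array([0.66666667, 0.5       ])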
Upvotes: 1