Reputation: 71
I am using K-fold cross validation to find a parameter that maximizes my F1 score. However, I checked the accuracy (1 - error rate), and it turns out that although the parameter gave a high F1 score, it gave a low accuracy. I randomly tried a few other values for the parameter, and even though they gave a lower F1 score, the accuracy was higher. I used separate data for training with k-fold and a test set extracted from the original training data.
Upvotes: 0
Views: 552
Reputation: 66805
F1 = 2 TP / (2 TP + FN + FP)
and ACC = (TP + TN) / (TP + TN + FP + FN),
thus as you can see, F1 is "biased" towards the positive class and gives no credit for correct classification of the negative samples (TN). While accuracy is a simple probabilistic object (how probable a correct classification is), F1 is one of many rather arbitrary ideas for focusing more on one class (in this case, the positive one), without a really nice probabilistic interpretation. Consequently there is no nice, straightforward relation between them: completely different models will have a good F1 score, and completely different ones a good accuracy. Only in the case where you have a perfect model (0 error) will it maximize both measures (and symmetrically, a terrible one with 0 accuracy minimizes both). In any other case, they will disagree at some point.
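To make the disagreement concrete, here is a minimal sketch (the confusion-matrix counts are hypothetical, not from the question): two models evaluated on the same 200-sample problem, where the aggressive model wins on F1 and the conservative one wins on accuracy.

```python
def f1(tp, fp, fn):
    # F1 = 2 TP / (2 TP + FP + FN); note it ignores TN entirely
    return 2 * tp / (2 * tp + fp + fn)

def accuracy(tp, tn, fp, fn):
    # ACC = (TP + TN) / (TP + TN + FP + FN)
    return (tp + tn) / (tp + tn + fp + fn)

# Model A: predicts positive aggressively -> many TP, but also many FP
a = dict(tp=95, tn=40, fp=60, fn=5)
# Model B: predicts positive conservatively -> few FP, but many FN
b = dict(tp=50, tn=95, fp=5, fn=50)

print(f1(a['tp'], a['fp'], a['fn']), accuracy(**a))  # ~0.745, 0.675
print(f1(b['tp'], b['fp'], b['fn']), accuracy(**b))  # ~0.645, 0.725
```

Model A has the higher F1 (it captures almost all positives) while Model B has the higher accuracy (its correct rejections of negatives count for accuracy, but are invisible to F1), which is exactly the pattern described in the question.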
Upvotes: 3