Reputation: 761
I'm working in Python and I'm trying to get the F1 score from my trained model. The documentation lists the syntax as:
f1_score(y_true, y_pred, average='macro')
but I cannot figure out what y_true and y_pred are supposed to be. Logically, y_true should be the true value of y and y_pred should be the predicted value of y, but by that definition I can only check one value at a time. Am I missing something, or is there a way to check it against the entire dataset?
Upvotes: 1
Views: 802
Reputation: 1064
The F1 score is the harmonic mean of the precision and recall of your predictions, i.e. it combines what portion of your positive predictions were correct (precision) and what portion of the actual positives you predicted (recall): https://en.wikipedia.org/wiki/F1_score
I believe that Sklearn's function wants an array or matrix of labels for y_true and y_pred, where y_true[i] is the "actual label of the i-th element" and y_pred[i] is the "predicted/classified label of the i-th element". The order of the two must match! The ordering is what allows Sklearn to compute the F-score over all predictions instead of just a single value.
e.g. If I use a classifier/model to predict whether each of 5 people will get cancer:
y_pred = [True, False, True, False, False]
and I find out that only the 3rd person got cancer:
y_true = [False, False, True, False, False]
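To make the arithmetic concrete, here is a minimal sketch that computes the binary F1 score by hand for the two lists above. The helper name `f1_from_labels` is made up for illustration and is not part of Sklearn; it only mirrors the precision/recall definition for a single positive class.

```python
def f1_from_labels(y_true, y_pred, positive=True):
    """Hand-rolled binary F1 (hypothetical helper, for illustration only)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Pair up the i-th true label with the i-th predicted label.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


y_pred = [True, False, True, False, False]
y_true = [False, False, True, False, False]

score = f1_from_labels(y_true, y_pred)
print(round(score, 3))  # 0.667 (precision 0.5, recall 1.0)
```

With Sklearn installed, passing the same two lists as f1_score(y_true, y_pred) should give the same binary score. For a trained model, y_pred would typically be the output of something like model.predict(X_test), where model and X_test are placeholder names for your own classifier and test data.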
Check the example in the Sklearn docs for more: http://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html
Upvotes: 1