user572575

Reputation: 1049

Sklearn accuracy, precision, recall, f1 all show the same result

I want to evaluate with accuracy, precision, recall and F1 using the code below, but they all show the same result.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

df = pd.read_csv(r'test.csv')

X = df.iloc[:,:10]
Y = df.iloc[:,10]

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2)

clf = DecisionTreeClassifier()

clf = clf.fit(X_train,y_train)

predictions = clf.predict(X_test)

accuracy = accuracy_score(y_test, predictions)
precision = precision_score(y_test, predictions, average='micro')
recall = recall_score(y_test, predictions, average='micro')
f1 = f1_score(y_test, predictions, average='micro')

print("Accuracy: ", accuracy)
print("precision: ", precision)
print("recall: ", recall)
print("f1: ", f1)

It shows output like this:

Accuracy:  0.8058823529411765
precision:  0.8058823529411765
recall:  0.8058823529411765
f1:  0.8058823529411765

All four metrics have the same value. How can I fix this?

Upvotes: 4

Views: 1170

Answers (1)

Jannik

Reputation: 1015

According to sklearn's documentation, this behavior is expected when using micro averaging in a multiclass setting:

Note that if all labels are included, “micro”-averaging in a multiclass setting will produce precision, recall and F that are all identical to accuracy.

Here is a nice blog article describing why these scores can be equal (with an intuitive example).

TL;DR

  1. F1 equals precision and recall when precision == recall
  2. With micro averaging, the total number of false positives always equals the total number of false negatives (every misclassified sample is a false negative for its true class and a false positive for the predicted class). Thus, precision == recall
  3. Finally, note that micro F1 always equals accuracy. See here
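The points above can be demonstrated on a small multiclass example (toy data made up for illustration): the micro-averaged scores all collapse to accuracy, while macro averaging, which scores each class separately and then averages, generally gives a different value. Switching `average='micro'` to `'macro'` (or `'weighted'`) is one way to get distinct metrics.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical multiclass labels and predictions
y_true = [0, 1, 2, 2, 1, 0, 1, 2]
y_pred = [0, 2, 2, 2, 1, 0, 0, 1]

acc = accuracy_score(y_true, y_pred)
micro_p = precision_score(y_true, y_pred, average='micro')
micro_r = recall_score(y_true, y_pred, average='micro')
micro_f1 = f1_score(y_true, y_pred, average='micro')

# All micro-averaged scores are identical to accuracy
assert acc == micro_p == micro_r == micro_f1

# Macro averaging computes per-class scores and averages them,
# so it differs from accuracy in general
macro_f1 = f1_score(y_true, y_pred, average='macro')
print(acc, macro_f1)
```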

Upvotes: 6
