Reputation: 1049
I want to evaluate with accuracy, precision, recall, and F1 using the code below, but it shows the same result for every metric.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

df = pd.read_csv(r'test.csv')
X = df.iloc[:, :10]
Y = df.iloc[:, 10]
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2)

clf = DecisionTreeClassifier()
clf.fit(X_train, y_train)
predictions = clf.predict(X_test)

accuracy = accuracy_score(y_test, predictions)
precision = precision_score(y_test, predictions, average='micro')
recall = recall_score(y_test, predictions, average='micro')
f1 = f1_score(y_test, predictions, average='micro')

print("Accuracy: ", accuracy)
print("precision: ", precision)
print("recall: ", recall)
print("f1: ", f1)
It shows this output:
Accuracy: 0.8058823529411765
precision: 0.8058823529411765
recall: 0.8058823529411765
f1: 0.8058823529411765
All four metrics have the same value. How can I fix this?
Upvotes: 4
Views: 1170
Reputation: 1015
According to sklearn's documentation, this behavior is expected when using micro as the average in a multiclass setting:
Note that if all labels are included, “micro”-averaging in a multiclass setting will produce precision, recall and F that are all identical to accuracy.
Here is a nice blog article describing why these scores can be equal (it also includes an intuitive example).
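For a quick demonstration, here is a minimal sketch; since your test.csv isn't available, it uses a synthetic multiclass dataset from make_classification as a stand-in. It shows that the 'micro'-averaged scores collapse to accuracy, while 'macro' and 'weighted' generally do not:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Synthetic 3-class data standing in for the asker's test.csv
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=0)

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
pred = clf.predict(X_test)

print("accuracy:", accuracy_score(y_test, pred))
for avg in ("micro", "macro", "weighted"):
    # 'micro' reproduces accuracy exactly; the other averages usually differ
    print(avg,
          precision_score(y_test, pred, average=avg),
          recall_score(y_test, pred, average=avg),
          f1_score(y_test, pred, average=avg))

If you want per-class detail instead of a single averaged number, classification_report from sklearn.metrics prints precision, recall, and F1 for each class separately.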
TL;DR
F1 equals recall and precision if recall == precision (since F1 = 2 * precision * recall / (precision + recall), which reduces to precision when the two are equal). Micro F1 always equals accuracy. See here.
Upvotes: 6