Matt

Reputation: 5653

Scikit-learn returning incorrect classification report and accuracy score

I'm training an SVM with an RBF kernel on 1200 examples of label 1 and 1200 examples of label 2. I thought I was getting 77% accuracy, which is what sklearn.metrics.accuracy_score reported. But when I hand-rolled my own accuracy score, like so:

def naive_accuracy(true, pred):
    number_correct = 0
    i = 0
    for y in true:
        if pred[i] == y:
            number_correct += 1.0
    return number_correct / len(true)

It got 50%. I believe I've wasted weeks of work based on a false accuracy score and classification report. Can anyone explain why this has happened? I'm very confused; I don't see what I'm doing wrong. When I tested the metrics.accuracy_score function on some dummy data like pred = [1, 1, 2, 2]; test = [1, 2, 1, 2], it gave me 50% as you'd expect. I think accuracy_score might be erring on my specific data somehow.

My feature vectors have 27 features each, with 1200 vectors of class 1 and 1200 vectors of class 2. My code is the following:

import numpy as np
from sklearn import svm
from sklearn.preprocessing import scale
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split  # sklearn.cross_validation in older versions

X = scale(np.asarray(X))
y = np.asarray(y)
X_train, X_test, y_train, y_test = train_test_split(X, y)

######## SVM ########
clf = svm.SVC()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
# 77%
print "SVM Accuracy:", accuracy_score(y_test, y_pred) # debugging
# 50%
print "*True* SVM Accuracy:", naive_accuracy(y_test, y_pred) # in-house debugging
# also 77%!
print "Classification report:\n", classification_report(y_test, y_pred) # debugging

Upvotes: 0

Views: 2410

Answers (1)

Aman

Reputation: 8985

Your implementation of naive_accuracy is buggy. You are comparing the first prediction, pred[0], against every true label (i is never updated inside the loop).

I would've just left a comment if not for the test case you've designed, which prevented you from zeroing in on the bug yourself.

Try running your code with:

pred = [1, 2, 2, 2]
test = [1, 1, 1, 1]

The accuracy returned will be 1.0!
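To make the contrast concrete, here is a small sketch (assuming the buggy naive_accuracy from the question is in scope) that runs it on the original dummy data and on this test case:

# Original dummy data: the bug is masked, because comparing pred[0] == 1
# against [1, 2, 1, 2] also happens to yield 2 matches out of 4.
print(naive_accuracy([1, 2, 1, 2], [1, 1, 2, 2]))  # 0.5, looks correct

# This test case: only 1 of 4 predictions is right, yet the score is perfect,
# because every true label equals pred[0].
print(naive_accuracy([1, 1, 1, 1], [1, 2, 2, 2]))  # 1.0, clearly wrong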

Also worth noting: the buggy code effectively measures how often the true label equals pred[0], so if the classes are uniformly distributed, the expected accuracy it returns on any random test set is 50%, which is exactly the figure you saw.
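
A quick simulation of that claim, assuming NumPy and the buggy naive_accuracy are available (a sketch, not part of the original answer):

import numpy as np

rng = np.random.RandomState(0)
true = list(rng.choice([1, 2], size=10000))  # balanced random labels
pred = list(rng.choice([1, 2], size=10000))  # arbitrary predictions
# The buggy loop only counts how often the true label equals pred[0],
# which is about 0.5 when both classes are equally likely.
print(naive_accuracy(true, pred))  # ~0.5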

It is also a good idea to keep a small test suite with several test cases, as sketched below; a single test case rarely covers all the relevant scenarios in non-trivial code.
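
For example, a minimal set of checks along those lines might look like this (the test_accuracy helper and its cases are hypothetical, not from the question):

def test_accuracy(acc):
    # (true, pred) pairs with known accuracies
    assert acc([1, 2, 1, 2], [1, 1, 2, 2]) == 0.5   # half correct: the original dummy data
    assert acc([1, 1, 1, 1], [1, 2, 2, 2]) == 0.25  # catches the "always compare pred[0]" bug
    assert acc([2, 2], [2, 2]) == 1.0               # all correct
    assert acc([1, 2], [2, 1]) == 0.0               # none correct

test_accuracy(naive_accuracy)  # the buggy version fails on the second assert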

Though not really needed, here is what you should do instead:

def naive_accuracy(true, pred):
    number_correct = 0
    for i, y in enumerate(true):
        if pred[i] == y:
            number_correct += 1.0
    return number_correct / len(true)
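
For what it's worth, since you are already using NumPy, an equivalent one-liner is possible (an alternative sketch, not part of the original answer):

import numpy as np

def accuracy(true, pred):
    # element-wise comparison, then the fraction of matches
    return np.mean(np.asarray(true) == np.asarray(pred))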

Upvotes: 6
