Reputation: 5235
I know sklearn has a nice method for getting cross-validation scores:
from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score
iris = datasets.load_iris()
clf = svm.SVC(kernel='linear', C=1)
scores = cross_val_score(clf, iris.data, iris.target, cv=5)
scores
I'd like to get the scores for specific training and test sets that I supply myself:
train_list = [train1, train2, train3]  # train1, train2, train3 are the training sets
test_list = [test1, test2, test3]  # test1, test2, test3 are the test sets
clf = svm.SVC(kernel='linear', C=1)
scores = some_nice_method(clf, train_list, test_list)
Is there a method in Python that returns the scores for such predefined train/test splits?
Upvotes: 3
Views: 1283
Reputation: 2623
My suggestion is to use k-fold cross-validation, as below. This gives you the train and test indices for each split along with its accuracy score. Note that in recent versions of scikit-learn, KFold is imported from sklearn.model_selection (not the old sklearn.cross_validation module) and takes the number of folds as n_splits.
from sklearn import svm
from sklearn import datasets
from sklearn.model_selection import KFold
from sklearn.metrics import accuracy_score
iris = datasets.load_iris()
X = iris.data
y = iris.target
clf = svm.SVC(kernel='linear', C=1)
kf = KFold(n_splits=5)
for train_index, test_index in kf.split(X):
    print("TRAIN:", train_index, "TEST:", test_index)
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    # Refit on this fold's training rows, then score on its held-out rows
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    score = accuracy_score(y_test, y_pred)
    print(score)
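If the splits are already known as index arrays, the loop above can likely be avoided: cross_val_score accepts an iterable of (train, test) index pairs as its cv argument. The following is a minimal sketch under that assumption; the three splits (custom_cv) are made up for illustration and stand in for whatever indices you actually have.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Hypothetical predefined splits: shuffle the row indices once and cut
# them into three disjoint test folds; the rest of each fold is training.
rng = np.random.RandomState(0)
folds = np.array_split(rng.permutation(len(X)), 3)
custom_cv = [
    (np.concatenate([folds[j] for j in range(3) if j != i]), folds[i])
    for i in range(3)
]

clf = svm.SVC(kernel='linear', C=1)
# cv may be an iterable yielding (train_indices, test_indices) pairs
scores = cross_val_score(clf, X, y, cv=custom_cv)
print(scores)
```

This returns one score per supplied pair, in the same order as the pairs.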
Upvotes: 1
Reputation: 76336
This takes just two lines of code, assuming each entry of train_list and test_list is an array of row indices into X and y:
for tr, te in zip(train_list, test_list):
    print(svm.SVC(kernel='linear', C=1).fit(X[tr, :], y[tr]).score(X[te, :], y[te]))
From the documentation of the classifier's score method:
score(X, y, sample_weight=None)
Returns the mean accuracy on the given test data and labels.
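If train_list and test_list hold the data sets themselves rather than indices, the same idea can be wrapped in a small helper. This is only a sketch: score_on_splits is a hypothetical name (it is not part of scikit-learn), and it assumes each list entry is an (X, y) tuple. The even/odd split of iris is purely illustrative.

```python
from sklearn import datasets, svm
from sklearn.base import clone


def score_on_splits(clf, train_list, test_list):
    """Fit a fresh copy of clf on each training set and score it on the
    matching test set. Hypothetical helper, not a scikit-learn function."""
    scores = []
    for (X_tr, y_tr), (X_te, y_te) in zip(train_list, test_list):
        model = clone(clf)  # unfitted copy, so splits do not share state
        model.fit(X_tr, y_tr)
        scores.append(model.score(X_te, y_te))  # mean accuracy
    return scores


iris = datasets.load_iris()
X, y = iris.data, iris.target

# Illustrative splits: even rows vs. odd rows, then the reverse
even, odd = slice(0, None, 2), slice(1, None, 2)
train_list = [(X[even], y[even]), (X[odd], y[odd])]
test_list = [(X[odd], y[odd]), (X[even], y[even])]

scores = score_on_splits(svm.SVC(kernel='linear', C=1), train_list, test_list)
print(scores)
```

Cloning the estimator per split keeps each fit independent, which mirrors what cross_val_score does internally.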
Upvotes: 2