Shan

Reputation: 3

How to do K-fold cross validation without using python libraries?

I am trying to do cross validation; however, I am only allowed to use the libraries below (as the professor requires):

import numpy as np
from sklearn import svm
from sklearn.datasets import load_iris

Therefore, I am not able to use `KFold`, for example, to split the data. How should I go about it? Any suggestions? This is what I would have written if `KFold` were allowed:

k = 10
kf = KFold(n_splits=k, random_state=None)

acc_score = []

for train_index, test_index in kf.split(X):
    X_train, X_test = X[train_index, :], X[test_index, :]
    y_train, y_test = y[train_index], y[test_index]

    SVC = train_model(X_train, y_train)
    y_pred = make_predictions(SVC, X_test)

    acc = accuracy_score(y_test, y_pred)
    acc_score.append(acc)

mean = sum(acc_score) / k

I was thinking of hard-coding the split, but there may be a better solution.

Upvotes: 0

Views: 523

Answers (1)

bujian

Reputation: 192

You can use the `np.random` module to generate random samples for the train-split indices. Example:

import numpy as np

np.random.seed(42)
n = len(X)  # total number of samples
train_indices = np.random.choice(n, int(0.8 * n), replace=False)

Then you can take the validation/test indices as the complement of those (e.g. with `np.setdiff1d`), and use a fixed seed per fold so the results stay reproducible.
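For a proper K-fold (where every sample appears in exactly one test fold), a common numpy-only approach is to shuffle the indices once and split them into k disjoint chunks. Here is a minimal sketch using only the imports the question allows; the default `svm.SVC()` parameters are an assumption, not something the question specifies:

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
k = 10

# Shuffle all sample indices once, then split into k near-equal,
# disjoint chunks -- each chunk serves as the test fold exactly once.
rng = np.random.default_rng(42)
indices = rng.permutation(len(X))
folds = np.array_split(indices, k)

acc_score = []
for i in range(k):
    test_index = folds[i]
    # Training indices are every other fold concatenated together.
    train_index = np.concatenate([folds[j] for j in range(k) if j != i])

    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]

    clf = svm.SVC().fit(X_train, y_train)  # default SVC params (assumption)
    acc_score.append(np.mean(clf.predict(X_test) == y_test))

print(sum(acc_score) / k)
```

Because the folds come from one permutation, they are guaranteed disjoint and cover the whole dataset, which independent `np.random.choice` draws per fold would not guarantee.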

Upvotes: 1
