Reputation: 329
I am trying to split my data into K folds, each with a train and a test set, but I am stuck at the last step.
Here is an example data set:
[1,2,3,4,5,6,7,8,9,10]
I have successfully created the partition for 5-fold cross-validation, and the output is:
fold=[[2, 1], [6, 0], [7, 8], [9, 5], [4, 3]]
Now I want to create K train/test splits, each using K-1 folds as training data and the remaining fold as the validation set.
I am using this code:
```
cross_val={"train":[],"test":[]}
new_fold=folds.copy()
for i in range(4):
    val=folds.pop(i)
    cross_val["train"].append(folds)
    cross_val["test"].append(val)
    folds[i:i]=[val]
```
The output that I am getting is:
```
{'train': [[[6, 0], [7, 8], [9, 5], [4, 3]],
           [[6, 0], [7, 8], [9, 5], [4, 3]],
           [[6, 0], [7, 8], [9, 5], [4, 3]],
           [[6, 0], [7, 8], [9, 5], [4, 3]]],
 'test': [[6, 0], [7, 8], [9, 5], [4, 3]]}
```
This is not the output I want.
The output I expect is:
```
train                                 test
[[6, 0], [7, 8], [9, 5], [4, 3]]      [2,1]
[[2, 1], [7, 8], [9, 5], [4, 3]]      [6,0]
[[6, 0], [2, 1], [9, 5], [4, 3]]      [7,8]
[[6, 0], [7, 8], [9, 5], [2, 1]]      [4,3]
[[6, 0], [7, 8], [2, 1], [4, 3]]      [9,5]
```
Upvotes: 1
Views: 11884
Reputation: 477794
Each iteration modifies the same list object and then appends that same object to `cross_val["train"]` again. Because every entry in `train` refers to one and the same list, any later edit to that list shows up in all of the entries.
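To see the aliasing in isolation, here is a minimal sketch (the names `shared`, `aliased`, and `snapshots` are made up for illustration and do not appear in the question). Appending a list stores a reference to it, whereas appending a copy stores an independent snapshot:

```python
shared = [[2, 1], [6, 0]]
aliased = []
snapshots = []

aliased.append(shared)          # stores a reference to the same list object
snapshots.append(list(shared))  # stores a new list (shallow copy)

shared.pop(0)                   # mutate the original list afterwards

print(aliased[0])    # [[6, 0]]          -> the mutation is visible here
print(snapshots[0])  # [[2, 1], [6, 0]]  -> the copy is unaffected
```

The copy is shallow, so the inner fold lists are still shared; that is fine as long as the folds themselves are not modified in place.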
You can build the cross-validation splits by creating a new list for every fold, for example with slicing:
```
train = []
test = []
cross_val={'train': train, 'test': test}
for i, testi in enumerate(fold):
    train.append(fold[:i] + fold[i+1:])
    test.append(testi)
```
For the given sample data, this gives us:
```
>>> from pprint import pprint
>>> pprint(cross_val)
{'test': [[2, 1], [6, 0], [7, 8], [9, 5], [4, 3]],
 'train': [[[6, 0], [7, 8], [9, 5], [4, 3]],
           [[2, 1], [7, 8], [9, 5], [4, 3]],
           [[2, 1], [6, 0], [9, 5], [4, 3]],
           [[2, 1], [6, 0], [7, 8], [4, 3]],
           [[2, 1], [6, 0], [7, 8], [9, 5]]]}
```
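If you would rather keep the pop-and-reinsert structure of the original loop, one possible fix is to append a copy of the remaining folds instead of the list itself. This is only a sketch under the assumption that `folds` holds the five folds from the question; it also loops over `len(folds)` rather than a hard-coded 4 so that every fold is used as the test set once:

```python
folds = [[2, 1], [6, 0], [7, 8], [9, 5], [4, 3]]
cross_val = {"train": [], "test": []}

for i in range(len(folds)):
    val = folds.pop(i)                       # temporarily remove fold i
    cross_val["train"].append(folds.copy())  # snapshot of the remaining folds
    cross_val["test"].append(val)
    folds.insert(i, val)                     # put fold i back at its position
```

This should produce the same `train` and `test` lists as the slicing version, and `folds` is left in its original order afterwards.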
Upvotes: 2