Reputation: 45
WEKA Cross Validation:
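import java.util.Random;
import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;

// "data" is a weka.core.Instances object, already loaded, with its class index set
Classifier cls = new J48();
Evaluation eval = new Evaluation(data);
Random rand = new Random(1); // using seed = 1
int folds = 10;
eval.crossValidateModel(cls, data, folds, rand);
System.out.println(eval.toSummaryString());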
What does "rand" mean? How does cross-validation work in this case? Are the 10 folds always shuffled?
Thank you!
Upvotes: 1
Views: 874
Reputation: 5658
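What does "rand" mean?
rand is a java.util.Random instance that crossValidateModel uses to shuffle the dataset before it is split into folds. The seed (1 here) fixes the sequence of pseudo-random numbers, so the same seed produces the same shuffle, and therefore the same folds, on every run.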
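How does cross-validation work in this case?
The data set is shuffled first. For example, if you had data rows 1-100 in order, after shuffling the first five might be (77, 12, 4, 7, 55) instead of (1, 2, 3, 4, 5). The shuffled data is then split into 10 folds, and each fold is used once as the test set while the model is trained on the other nine.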
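To make the effect of the seed concrete, here is a minimal sketch in plain Java. It does not use WEKA at all: SeededShuffle and shuffledRows are made-up names for illustration, and Collections.shuffle stands in for whatever shuffling WEKA does internally, so the exact order will not match WEKA's. The point is only that the same seed gives the same order every time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class SeededShuffle {
    // Hypothetical helper: returns row numbers 1..n in an order that is
    // determined entirely by the seed.
    public static List<Integer> shuffledRows(int n, long seed) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            rows.add(i);
        }
        // A Random built from the same seed always yields the same permutation.
        Collections.shuffle(rows, new Random(seed));
        return rows;
    }

    public static void main(String[] args) {
        List<Integer> a = shuffledRows(100, 1L);
        List<Integer> b = shuffledRows(100, 1L);
        List<Integer> c = shuffledRows(100, 2L);
        System.out.println("first five rows, seed 1: " + a.subList(0, 5));
        System.out.println("seed 1 twice, same order? " + a.equals(b));
        System.out.println("seed 1 vs seed 2, same order? " + a.equals(c));
    }
}
```

Running it twice should print the same first five rows both times, which is why results from crossValidateModel are repeatable when the seed is fixed.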
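Are the 10 folds always shuffled?
It depends on the tool or library, but in WEKA they are. Evaluation.crossValidateModel works on a copy of the dataset, shuffles that copy with the Random you pass in and, if the class attribute is nominal, stratifies it before cutting it into folds. So the folds are not simply rows 1-10, then 11-20, and so on in file order. Taking consecutive blocks from an unshuffled file would bias the estimate, especially if rows that sit next to each other in the file have similar characteristics. That is why the data is randomized first. With a fixed seed you get the same folds on every run; change the seed to get a different split.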
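The shuffle-then-split idea can be sketched in plain Java as below. This is an illustration only, not WEKA's source code: KFoldSketch and testFolds are hypothetical names, there is no stratification, and the row order comes from Collections.shuffle rather than from WEKA. It shows how shuffled rows get cut into 10 test sets, so each fold holds rows from all over the original file.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class KFoldSketch {
    // Hypothetical helper: shuffles row indices 0..n-1 with the given seed,
    // then cuts them into `folds` contiguous test sets whose sizes differ
    // by at most one.
    public static List<List<Integer>> testFolds(int n, int folds, long seed) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            rows.add(i);
        }
        Collections.shuffle(rows, new Random(seed));

        List<List<Integer>> result = new ArrayList<>();
        int base = n / folds;   // minimum size of each fold
        int extra = n % folds;  // the first `extra` folds get one more row
        int start = 0;
        for (int f = 0; f < folds; f++) {
            int size = base + (f < extra ? 1 : 0);
            result.add(new ArrayList<>(rows.subList(start, start + size)));
            start += size;
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> folds = testFolds(100, 10, 1L);
        for (int f = 0; f < folds.size(); f++) {
            // Every row that is not in fold f would be used for training.
            System.out.println("fold " + f + " test rows: " + folds.get(f));
        }
    }
}
```

Each row ends up in exactly one test fold, so every row is tested once and used for training in the other nine rounds.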
Upvotes: 2