vubo

Reputation: 45

Cross Validation WEKA random

WEKA Cross Validation:

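 import java.util.Random;
 import weka.classifiers.Classifier;
 import weka.classifiers.Evaluation;
 import weka.classifiers.trees.J48;

 // data is a weka.core.Instances object with its class index already set
 Classifier cls = new J48();
 Evaluation eval = new Evaluation(data);
 Random rand = new Random(1);  // using seed = 1
 int folds = 10;
 eval.crossValidateModel(cls, data, folds, rand);
 System.out.println(eval.toSummaryString());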

What does "rand" mean here? How does cross-validation work in this case? Are the 10 folds always shuffled?

Thank you!

Upvotes: 1

Views: 874

Answers (1)

applecrusher

Reputation: 5658

What does "rand" mean?

rand is a java.util.Random instance that WEKA uses to shuffle the dataset before cross-validation. The seed determines the shuffle: the same seed produces the same ordering, and therefore the same folds, on every run.
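To make the role of the seed concrete, here is a minimal sketch in plain Java (no WEKA involved). The class name SeedSketch and the helper firstDraws are made up for this illustration. It only shows that two Random objects created with the same seed produce the same sequence of values.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper class, for illustration only.
class SeedSketch {
    // Returns the first n values in [0, bound) drawn from a Random built with the given seed.
    static int[] firstDraws(long seed, int n, int bound) {
        Random rand = new Random(seed);
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = rand.nextInt(bound);
        }
        return out;
    }

    public static void main(String[] args) {
        // Both lines print the same five numbers because the seed is the same.
        System.out.println(Arrays.toString(firstDraws(1L, 5, 100)));
        System.out.println(Arrays.toString(firstDraws(1L, 5, 100)));
    }
}
```

This is why a fixed seed such as 1 makes an experiment repeatable: anything driven by that Random object behaves identically from run to run.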

How does cross-validation work in this case?

The dataset is shuffled. For example, if you had rows 1-100 in order, the first five rows after shuffling might be (77, 12, 4, 7, 55) instead of (1, 2, 3, 4, 5).
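The shuffling described above can be sketched in plain Java. This is an illustration of the idea, not WEKA's own code, and the names ShuffleSketch and shuffledRows are assumptions of this example. The actual order you get depends on the seed, so it will not match the numbers quoted above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper class, for illustration only.
class ShuffleSketch {
    // Returns the row numbers 1..n in an order determined by the seed.
    static List<Integer> shuffledRows(int n, long seed) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            rows.add(i);
        }
        Collections.shuffle(rows, new Random(seed));
        return rows;
    }

    public static void main(String[] args) {
        List<Integer> rows = shuffledRows(100, 1L);
        // First five row numbers after shuffling with seed 1.
        System.out.println(rows.subList(0, 5));
    }
}
```

Every row is still present exactly once after the shuffle; only the order changes.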

Are the 10 folds always shuffled?

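It depends on the tools or libraries you use, but with this WEKA call they are. crossValidateModel shuffles a copy of the data with the Random you pass in before splitting it into folds, and it also stratifies the folds when the class attribute is nominal. Because the seed is fixed, you get the same shuffled folds on every run; change the seed to get a different split. Without shuffling, the folds would simply be rows 1-10, then 11-20, and so on. That causes bias, especially if rows that sit together in the file have similar characteristics. That is why the data is best randomized.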
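To see why row order matters when folds are cut as consecutive blocks, here is a small sketch in plain Java. It is not WEKA's implementation; FoldSketch, orderedRows and testFolds are hypothetical names, and the example assumes 100 rows and 10 folds.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper class, for illustration only.
class FoldSketch {
    // Returns the row numbers 1..n in file order.
    static List<Integer> orderedRows(int n) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            rows.add(i);
        }
        return rows;
    }

    // Cuts the list into k consecutive blocks; block f serves as the test set of fold f.
    static List<List<Integer>> testFolds(List<Integer> rows, int k) {
        List<List<Integer>> folds = new ArrayList<>();
        int n = rows.size();
        for (int f = 0; f < k; f++) {
            int start = f * n / k;
            int end = (f + 1) * n / k;
            folds.add(new ArrayList<>(rows.subList(start, end)));
        }
        return folds;
    }

    public static void main(String[] args) {
        List<Integer> rows = orderedRows(100);
        // Without shuffling, the first test fold is just rows 1 to 10.
        System.out.println(testFolds(rows, 10).get(0));

        // After shuffling with a seeded Random, the first test fold mixes rows from the whole file.
        Collections.shuffle(rows, new Random(1));
        System.out.println(testFolds(rows, 10).get(0));
    }
}
```

In each round, one block is held out for testing and the remaining nine are used for training, so every row is tested exactly once whether or not the data was shuffled first.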

Upvotes: 2
