oma07

Reputation: 37

Stratified sampling in WEKA

How can I split a data set into a training set and a test set containing 75% and 25% of the original data, respectively, using stratified sampling so that the class proportions are preserved in both sets? I am trying to do this with WEKA.

The "RemovePercentage" filter does not split in a stratified manner, and the "StratifiedRemoveFolds" filter does not work with percentages.

I would appreciate any help or suggestion.

Upvotes: 1

Views: 2182

Answers (1)

oma07

Reputation: 37

As a workaround, I split the data set in two using StratifiedRemoveFolds with the number of folds set to 2, which yields a stratified 50%-50% split. I then split one of those halves in two with the same method, giving two subsets that each hold 25% of the original data. Finally, I merged one of the 25% subsets with the untouched 50% half, which yields the 75%-25% stratified split that was my target.
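
To make the arithmetic of the halve, halve again, then merge approach concrete, here is a minimal Python sketch. It is not WEKA code and does not reproduce how StratifiedRemoveFolds works internally: `stratified_folds` and `split_75_25` are hypothetical helpers written for illustration, and dealing each class out round-robin after a shuffle is my own assumption about one reasonable way to stratify.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(indices, labels, num_folds, seed=0):
    """Split `indices` into `num_folds` lists, keeping class proportions.

    Hypothetical helper: each class is shuffled and dealt out round-robin,
    so every fold receives roughly the same share of every class.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)

    folds = [[] for _ in range(num_folds)]
    for members in by_class.values():
        rng.shuffle(members)
        for k in range(num_folds):
            folds[k].extend(members[k::num_folds])
    return folds


def split_75_25(labels, seed=0):
    """Halve the data, halve one half again, then merge 50% + 25%."""
    half_a, half_b = stratified_folds(range(len(labels)), labels, 2, seed)
    quarter_a, quarter_b = stratified_folds(half_b, labels, 2, seed)
    train = half_a + quarter_a   # 50% + 25% of the original data
    test = quarter_b             # the remaining 25%
    return train, test


# Toy data: 80 instances of class "a" and 40 of class "b" (2:1 ratio).
labels = ["a"] * 80 + ["b"] * 40
train, test = split_75_25(labels)
print(Counter(labels[i] for i in train))
print(Counter(labels[i] for i in test))
```

On the toy data this gives 90 training and 30 test instances, each keeping the original 2:1 class ratio. When a class size is not divisible by the number of folds, the proportions are only approximate. If I read the filter's options correctly, running StratifiedRemoveFolds with 4 folds and selecting one fold as the test set (then inverting the selection for the training set) should give the same 75%-25% proportions in a single step, but I have not verified that here.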

Upvotes: 1
