Anatoliy Federer

Reputation: 17

Split data in R into percentage-based subsets

I have a dataset that I need to split into training and test subsets. How can I compute the split for the following case?

The data should be split as follows: 1) First case: 15% of the data for training and 5% for testing.

How do I write this correctly in R?

Upvotes: 0

Views: 985

Answers (1)

yarnabrina

Reputation: 1676

Without createDataPartition (from the caret package), an easy way is as follows.

Suppose you want a proportion train_prop of the dataset my_dataset as the training set and a proportion test_prop as the test set. Ideally these would sum to 1, or to 1 - val_prop if you also hold out a validation set, but here you want 15% and 5% for some reason. So set train_prop to 0.15 and test_prop to 0.05.

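n <- nrow(my_dataset)
n_train <- round(n * train_prop)
n_test <- round(n * test_prop)
# 0 = unused, 1 = train, 2 = test; the counts sum to n, so there is exactly one label per row
indices <- sample(rep.int(c(0, 1, 2), times = c(n - n_train - n_test, n_train, n_test)))
train_set <- my_dataset[indices == 1, ]
test_set <- my_dataset[indices == 2, ]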
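
For comparison, here is a minimal sketch of an alternative that shuffles the row numbers once and takes two disjoint slices, rather than shuffling a label vector. It is not part of the original answer. The data frame below is a made-up stand-in for my_dataset with 200 rows, and the id column exists only so the final checks can confirm that no row lands in both sets.

```r
set.seed(42)  # fixed seed so the split is reproducible

# Hypothetical stand-in for my_dataset: 200 rows of made-up values
my_dataset <- data.frame(id = seq_len(200), value = rnorm(200))

train_prop <- 0.15
test_prop <- 0.05

n <- nrow(my_dataset)
n_train <- round(n * train_prop)
n_test <- round(n * test_prop)

# Shuffle all row numbers once, then take disjoint slices of the shuffle
shuffled <- sample.int(n)
train_set <- my_dataset[shuffled[seq_len(n_train)], ]
test_set <- my_dataset[shuffled[n_train + seq_len(n_test)], ]

# Sanity checks: expected sizes and no row shared between the two sets
stopifnot(nrow(train_set) == n_train, nrow(test_set) == n_test)
stopifnot(length(intersect(train_set$id, test_set$id)) == 0)
```

With 200 rows this gives 30 training rows and 10 test rows. The remaining 160 rows are simply left unused, which matches the 15% / 5% split asked for.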

Upvotes: 1
