Reputation: 665
I have a list of data frames:
df1 <- data.frame(a = 1:4, b = 3:6)
df2 <- data.frame(a = c(5,3,4,4), b = c(9,9,1,0))
df_list <- list(df1, df2)
I want to create a new list containing df1_testing, df1_training, df2_testing, and df2_training datasets, using a 75/25 split (75% of the rows for training, 25% for testing).
Upvotes: 1
Views: 99
Reputation: 52024
You can do this with a small helper function applied to each data frame with lapply. You could also make the split proportion (here 0.75) a parameter of the function.
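split2 <- function(df){
  # Randomly pick 75% of the row indices for the training set
  idx <- sample(x = seq_len(nrow(df)), size = floor(.75*nrow(df)), replace = FALSE)
  # Sampled rows form the training set; the remaining rows form the test set
  list(train = df[idx,], test = df[-idx,])
}
lapply(df_list, split2)
Which gives, for example (the rows chosen vary between runs unless you call set.seed() first):
[[1]]
[[1]]$train
a b
1 1 3
3 3 5
2 2 4
[[1]]$test
a b
4 4 6
[[2]]
[[2]]$train
a b
1 5 9
2 3 9
3 4 1
[[2]]$test
a b
4 4 0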
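The suggestion above to make the split proportion a parameter, combined with the naming the question asks for (df1_training, df1_testing, and so on), could be sketched roughly as follows. The names split_df, prop, and flat are hypothetical and chosen only for illustration, and the set.seed() call is an assumption added so the split is reproducible.

```r
# Hypothetical helper: split one data frame into training/testing parts.
# `prop` is the fraction of rows assigned to the training set.
split_df <- function(df, prop = 0.75) {
  n <- nrow(df)
  idx <- sample(seq_len(n), size = floor(prop * n), replace = FALSE)
  # setdiff() avoids the df[-integer(0), ] pitfall when idx is empty
  rest <- setdiff(seq_len(n), idx)
  list(training = df[idx, , drop = FALSE],
       testing  = df[rest, , drop = FALSE])
}

df1 <- data.frame(a = 1:4, b = 3:6)
df2 <- data.frame(a = c(5, 3, 4, 4), b = c(9, 9, 1, 0))
# Naming the list elements lets the names carry through to the result
df_list <- list(df1 = df1, df2 = df2)

set.seed(42)  # assumption: fixed seed for a reproducible split
splits <- lapply(df_list, split_df, prop = 0.75)

# Flatten one level: names become "df1.training", "df1.testing", ...
flat <- unlist(splits, recursive = FALSE)
# Replace the first "." in each name with "_"
names(flat) <- sub(".", "_", names(flat), fixed = TRUE)
names(flat)
```

Each element of flat is still a data frame, so flat$df1_training can be used directly. Keeping the nested splits list instead is also reasonable if you prefer to access the parts as splits$df1$training.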
Upvotes: 1