Reputation: 69
I have a table containing 6,800,000 rows and 35 columns. I want to generate a batch of 34 tables containing 200,000 rows each. Previously, I've tried:
library(data.table)
table <- fread("dataset.preimp")
table_1 <- table[sample(nrow(table), size = 200000, replace = FALSE) , ]
This generates a table with 200,000 randomly sampled rows. If I want to make a second table, also with 200,000 randomly sampled rows, that excludes the rows already in the first table, how would I do that?
Upvotes: 2
Views: 104
Reputation: 35297
Assign each row a random table ID and split on it, so that every row appears in exactly one table. A small example with the built-in mtcars (32 rows, split into 4 tables of 8):
table_ids <- sample(rep(1:4, each = 8))
split(mtcars, table_ids)
For your example:
table_ids <- sample(rep(1:34, each = 200000))
table_list <- split(table, table_ids)
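To check that this really gives non-overlapping tables, here is a minimal sketch on toy data. The names `dt`, `n_tables` and `n_each` are made up for illustration, and it uses a plain data.frame so it runs without extra packages; the same `split()` call also works on a data.table.

```r
set.seed(1)  # only so the sketch is reproducible

# Toy stand-in for the real table (hypothetical columns)
dt <- data.frame(id = 1:20, value = rnorm(20))

n_tables <- 4  # 34 in the question
n_each   <- 5  # 200000 in the question

# One label per row, shuffled, so each row lands in exactly one table
table_ids  <- sample(rep(seq_len(n_tables), each = n_each))
table_list <- split(dt, table_ids)

# Every table has n_each rows
stopifnot(all(sapply(table_list, nrow) == n_each))

# No row id appears in more than one table, and none is lost
all_ids <- unlist(lapply(table_list, function(x) x$id))
stopifnot(!anyDuplicated(all_ids), length(all_ids) == nrow(dt))
```

Note that `rep(..., each = n_each)` has to produce exactly `nrow(dt)` labels. That holds in the question, since 34 * 200000 = 6800000.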
Upvotes: 3