Reputation: 291
I'm trying to use the sample function in R to split a sequence of numbers into several equal parts for later use, but I keep getting repeated values even though I specify sampling without replacement.
trials <- seq(1,21,1)
set.seed(5)
p1.trials <- sample(trials, 7, replace=F)
p1.trials
This yields the vector: 5, 14, 18, 6, 2, 12, 8
trials <- trials[-p1.trials]
p2.trials <- sample(trials, 7, replace=F)
p2.trials
This yields the vector: 19, 20, 3, 7, 9, 4, 16
p3.trials <- trials[-p2.trials]
p3.trials
This yields the vector: 1, 3, 9, 10, 13, 16, 17, 19, 20, 21
Can anybody help me figure out (a) why I'm getting repeated values (e.g., "3" is in both p2.trials and p3.trials) and (b) why the p3.trials subsetting produces 10 numbers instead of 7?
Upvotes: 0
Reputation: 1005
You can do the whole thing more efficiently by calling the sample function
once to shuffle the sequence, then subsetting the result into 3 equal groups.
# Create data
trials <- seq(1,21,1)
set.seed(5)
# Randomize trials before subsetting
random_order <- sample(1:21, replace=FALSE)
trials2 <- trials[random_order]
# Subset
p1.trials <- trials2[1:7]
p2.trials <- trials2[8:14]
p3.trials <- trials2[15:21]
# Check
p1.trials
## 5 14 18 6 2 12 8
p2.trials
## 16 13 17 4 21 3 10
p3.trials
## 20 7 19 11 15 9 1
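If you want the same shuffle-then-cut idea without typing the index ranges by hand, something along these lines should work. This is only a sketch: the names n_groups, shuffled, group_id and parts are mine, and it assumes the length of trials divides evenly by the number of groups.

```r
trials <- seq(1, 21, 1)
set.seed(5)

n_groups <- 3  # hypothetical name; assumes length(trials) is a multiple of it

# sample(x) with no size argument returns a random permutation of x
shuffled <- sample(trials)

# Label the first 7 shuffled values as group 1, the next 7 as group 2, and so on
group_id <- rep(seq_len(n_groups), each = length(trials) / n_groups)

# split() returns a list with one vector per group label
parts <- split(shuffled, group_id)

parts[[1]]
parts[[2]]
parts[[3]]
```

Because every value is drawn exactly once in the shuffle, the groups cannot overlap, whatever the seed.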
Upvotes: 0
Reputation: 2952
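The first call works only because, at that point, every value in trials equals its own index, so trials[-p1.trials] happens to remove the right elements. After that, values and indices no longer match. A negative subscript drops elements by position, not by value, and positions beyond the end of the vector are silently ignored, which is why trials[-p2.trials] removes the wrong elements and leaves 10 values instead of 7. Instead, remove by value with setdiff: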
trials <- seq(1,21,1)
set.seed(5)
p1.trials <- sample(trials, 7, replace=F)
p1.trials
trials <- setdiff(trials,p1.trials)
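To carry the same idea through all three groups, here is a minimal sketch. The variable name remaining and the toy vector x are mine, not from the question; the rest follows the question's own code.

```r
trials <- seq(1, 21, 1)
set.seed(5)

# First group: 7 values drawn without replacement
p1.trials <- sample(trials, 7, replace = FALSE)

# Remove the drawn values BY VALUE, leaving 14 candidates
remaining <- setdiff(trials, p1.trials)

# Second group: 7 of the 14 remaining values
p2.trials <- sample(remaining, 7, replace = FALSE)

# Third group: whatever is left after removing the second group by value
p3.trials <- setdiff(remaining, p2.trials)

# Sanity checks: 7 values in the last group, no value in more than one group
stopifnot(length(p3.trials) == 7)
stopifnot(!anyDuplicated(c(p1.trials, p2.trials, p3.trials)))

# Why the original approach broke: negative subscripts are positions
x <- c(10, 20, 30)
x[-2]    # drops the element at position 2, giving 10 30
x[-20]   # position 20 does not exist, so nothing is dropped: 10 20 30
```

The last two lines show the behaviour behind part (b) of the question: out-of-range negative positions are ignored rather than raising an error, so fewer elements get removed than expected.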
Upvotes: 1