user2917781

Reputation: 291

Problem with sample function in R

I'm trying to use the sample function in R to split a sequence of numbers into several equal parts for later use, but I keep getting repeated values even though I set replace=F.

trials <- seq(1,21,1)
set.seed(5)
p1.trials <- sample(trials, 7, replace=F)
p1.trials

This yields the vector: 5, 14, 18, 6, 2, 12, 8

trials <- trials[-p1.trials]
p2.trials <- sample(trials, 7, replace=F) 
p2.trials

This yields the vector: 19, 20, 3, 7, 9, 4, 16

p3.trials <- trials[-p2.trials]
p3.trials

This yields the vector: 1, 3, 9, 10, 13, 16, 17, 19, 20, 21

Can anybody help me figure out (a) why I'm getting repeated values (e.g., "3" is in both p2.trials and p3.trials) and (b) why the p3.trials subsetting produces 10 numbers instead of 7?

Upvotes: 0

Views: 1709

Answers (2)

SlowLoris

Reputation: 1005

You can do the whole thing more efficiently by calling sample just once to shuffle the entire sequence, then slicing the shuffled result into 3 equal groups.

# Create data
trials <- seq(1,21,1)
set.seed(5)

# Shuffle the indices once, then reorder trials by them
random_order <- sample(seq_along(trials), replace=FALSE)
trials2 <- trials[random_order]

# Subset
p1.trials <- trials2[1:7]
p2.trials <- trials2[8:14]
p3.trials <- trials2[15:21]

# Check
p1.trials
##  5 14 18  6  2 12  8
p2.trials
##  16 13 17  4 21  3 10
p3.trials
##  20  7 19 11 15  9  1
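
If the number of groups might change, the same shuffle-once idea can be sketched with split(). This is only an illustration, not part of the original answer: n_groups, shuffled, group_id and parts are names made up for this example, and it assumes the length of trials is divisible by the number of groups.

```r
# Sketch: shuffle once, then cut into n_groups equal parts with split().
# n_groups, shuffled, group_id and parts are hypothetical names.
set.seed(5)
trials <- seq(1, 21, 1)
n_groups <- 3

# Assumption: length(trials) is an exact multiple of n_groups
stopifnot(length(trials) %% n_groups == 0)

# A random permutation of all values, so no value can appear twice
shuffled <- sample(trials)

# Group labels 1,1,...,2,2,...,3,3,... with one label per shuffled value
group_id <- rep(seq_len(n_groups), each = length(trials) / n_groups)

# A named list with one vector per group ("1", "2", "3")
parts <- split(shuffled, group_id)

p1.trials <- parts[["1"]]
p2.trials <- parts[["2"]]
p3.trials <- parts[["3"]]
```

Because every group is a slice of one permutation, the groups cannot overlap, whatever the seed or R version.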

Upvotes: 0

Shape

Reputation: 2952

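The first subset works only because trials starts out as 1 to 21, so each value equals its own index. Once elements have been removed, values and indices no longer line up. Negative indexing with - drops elements by position, not by value, and positions beyond the length of the vector are silently ignored. That is why trials[-p2.trials] removes the wrong elements, and only four of them (positions 3, 4, 7 and 9 of the 14 remaining), leaving 10 values instead of 7. Use setdiff instead, which removes by value: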

trials <- seq(1,21,1)
set.seed(5)
p1.trials <- sample(trials, 7, replace=FALSE)
p1.trials
# Remove the sampled values (not positions) from the pool
trials <- setdiff(trials, p1.trials)
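
To make the position-versus-value difference concrete, and to carry the setdiff approach through all three groups, here is a minimal sketch. The vector x and the name remaining are made up for illustration; the rest reuses the names from the question. The sampled values depend on the R version and seed, so none are shown.

```r
# Negative indexing drops by POSITION; setdiff drops by VALUE.
x <- c(10, 20, 30, 40)   # hypothetical example vector
x[-2]                    # drops the 2nd element: 10 30 40
x[-20]                   # there is no position 20, nothing is dropped: 10 20 30 40
setdiff(x, 20)           # drops the value 20: 10 30 40

# Full three-way split, removing sampled values from the pool each time
set.seed(5)
trials <- seq(1, 21, 1)

p1.trials <- sample(trials, 7, replace = FALSE)
remaining <- setdiff(trials, p1.trials)      # 14 values left

p2.trials <- sample(remaining, 7, replace = FALSE)
p3.trials <- setdiff(remaining, p2.trials)   # the last 7 values

# Sanity checks: three groups of 7, no overlap, nothing lost
stopifnot(length(p3.trials) == 7)
stopifnot(anyDuplicated(c(p1.trials, p2.trials, p3.trials)) == 0)
stopifnot(setequal(c(p1.trials, p2.trials, p3.trials), 1:21))
```

Note that p3.trials comes out in ascending order here because setdiff keeps the order of its first argument; wrap it in sample() if the order within the last group matters.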

Upvotes: 1
