user11916948

Reputation: 954

How to code constrained randomisation (2 factors)

I want to randomize one factor and another factor should be randomized within the first factor. How do I do that?

id <- rep(c(10,20,30), each=3)
visit <- rep(1:3,3)
df <- data.frame(id, visit)
df
  id visit
1 10     1
2 10     2
3 10     3
4 20     1
5 20     2
6 20     3
7 30     1
8 30     2
9 30     3

It could, for example, look like this:

  id visit
1 20     1
2 20     3
3 20     2
4 30     3
5 30     2
6 30     1
7 10     1
8 10     2
9 10     3

Here is code to randomise the order of the ids, but I don't know how to put this in a function and then also randomise the second column.

library(magrittr)  # provides the %>% pipe
uniq <- unique(df[,1]) %>% sample()

Upvotes: 1

Views: 45

Answers (1)

jay.sf

Reputation: 72758

You could sample the visits separately for each unique id, using lapply.

set.seed(42)
dat$visit <- unlist(lapply(unique(dat$id), function(i) sample(dat$visit[dat$id == i])))
dat
#   id visit
# 1 10     2
# 2 10     1
# 3 10     3
# 4 20     3
# 5 20     1
# 6 20     2
# 7 30     3
# 8 30     1
# 9 30     2
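
As a possible alternative to the lapply call, base R's ave() can apply a function within each id and put the results back in the original row positions. This is only a sketch, not part of the original answer: the helper name shuffle is made up for illustration, and the data frame is assumed to have the same shape as dat in the Data section.

```r
# Assumed input: same shape as `dat` from the Data section
dat <- expand.grid(visit = 1:3, id = (1:3) * 10)[2:1]

# Hypothetical helper: permute x by index. This avoids sample()'s special
# case where a single number n is treated as 1:n.
shuffle <- function(x) x[sample.int(length(x))]

set.seed(42)
# ave() calls shuffle() on the visits of each id and writes the shuffled
# values back into the rows that belong to that id
dat$visit <- ave(dat$visit, dat$id, FUN = shuffle)
dat
```

The id column is left untouched, so the rows stay grouped by id and only the visit order changes within each group.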

Edit: To also shuffle the row order, you could sample the rows afterwards with dat[sample(nrow(dat)), ]. Note that this shuffles all rows, so the ids end up interleaved rather than staying in blocks. Or all combined in a transform():

set.seed(42)
transform(dat,
          visit=unlist(lapply(unique(dat$id), function(i) 
            sample(dat$visit[dat$id == i]))))[sample(nrow(dat)), ]
#   id visit
# 8 30     3
# 7 30     2
# 4 20     1
# 1 10     1
# 5 20     2
# 2 10     3
# 9 30     1
# 3 10     2
# 6 20     3

To shuffle the order of the id blocks while keeping each id's rows together, with the visits shuffled within each block, you could use a by approach.

set.seed(42)
do.call(rbind, by(dat, dat$id, function(x) {
  transform(x, visit=sample(visit))
})[sample(seq_along(unique(dat$id)))])
#      id visit
# 30.7 30     2
# 30.8 30     3
# 30.9 30     1
# 20.4 20     1
# 20.5 20     2
# 20.6 20     3
# 10.1 10     1
# 10.2 10     3
# 10.3 10     2

Explanation: by splits the data by "id" into a list of data frames, each of which can be transformed as above. The order of the list is then sampled, and the pieces are combined with rbind into the resulting data frame.
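
Since the question asks how to put this in a function, the split, shuffle and rbind steps could be wrapped roughly as follows. This is a minimal sketch under stated assumptions: the function name shuffle_nested and its arguments outer and inner are hypothetical, and it uses split() in place of by(), which should behave the same way for this purpose.

```r
# Hypothetical helper: shuffle the order of the `outer` groups and,
# within each group, the values of the `inner` column
shuffle_nested <- function(df, outer = "id", inner = "visit") {
  # One data frame per group, in order of first appearance
  groups <- split(df, factor(df[[outer]], levels = unique(df[[outer]])))
  # Shuffle the inner column inside each group (index-based, so a
  # group with a single row is handled safely)
  groups <- lapply(groups, function(g) {
    g[[inner]] <- g[[inner]][sample.int(nrow(g))]
    g
  })
  # Shuffle the order of the groups, then stack them back together
  out <- do.call(rbind, groups[sample.int(length(groups))])
  rownames(out) <- NULL
  out
}

# Usage with the data frame from the question
df <- data.frame(id = rep(c(10, 20, 30), each = 3), visit = rep(1:3, 3))
set.seed(42)
res <- shuffle_nested(df)
res
```

Each id still occupies three consecutive rows in res. Only the order of the blocks and the order of the visits inside each block are randomised.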


Data:

(dat <- expand.grid(visit=1:3, id=(1:3)*10)[2:1])
#   id visit
# 1 10     1
# 2 10     2
# 3 10     3
# 4 20     1
# 5 20     2
# 6 20     3
# 7 30     1
# 8 30     2
# 9 30     3

Upvotes: 1
