user3652621
user3652621

Reputation: 3634

Randomly assign elements repeatedly to a limited number of groups

There are N groups (aka judges, let's say 17), and M elements (let's call them cases, let's say 22) such that 3*M <= 4*N.

N <- LETTERS[1:17]
M <- 1:22

I want to assign each of the N judges 4 or fewer cases, such that each case is evaluated by no more or no fewer than 3 judges, and no judge sees the same case twice.

A : 1, 2, 19
B : 2, 3, 8, 22
...
Q : 1, 2, 12, 10

Any quick and easy way to do it in R?

Tried this so far:

df <- data.frame(ID=rep(M,3))
values <- N
df$values[sample(1:nrow(df), nrow(df), FALSE)] <- rep(values, 4)

Upvotes: 4

Views: 806

Answers (3)

Andrew Taylor
Andrew Taylor

Reputation: 3488

GetJudgeCaseList <- function(CaseList, judgeList, casesAllowed, NumJudges) {
    e <- new.env()
    e$casesLeft <- data.frame(Judges = judgeList, itersLeft = casesAllowed)
    e$judgeList = judgeList
doCase <- function(i) {
pickJudges <- function(NumJudges, judgeList) {
  CurJudges <- sample(judgeList, NumJudges)
  return(CurJudges)
}
case <- pickJudges(NumJudges, e$judgeList)
e$casesLeft[casesLeft$Judges%in%case, 2] <-  e$casesLeft[casesLeft$Judges%in%case, 2]  - 1
e$judgeList <- e$casesLeft$Judges[e$casesLeft$itersLeft!=0]
return(data.frame(Case = CaseList[i], judges = paste0(case, collapse = ", ")))
}
Cases <- do.call(rbind, lapply(1:length(CaseList), doCase))
return(Cases)
}
GetJudgeCaseList(CaseList = c(1:22), judgeList = N, casesAllowed = 4, NumJudges = 3)


   Case  judges
1     1 a, h, o
2     2 k, i, j
3     3 j, q, a
4     4 j, n, p
5     5 g, o, n
6     6 q, g, l
7     7 g, d, i
8     8 b, l, f
9     9 m, b, i
10   10 k, m, c
11   11 l, m, p
12   12 m, o, q
13   13 p, g, b
14   14 p, f, b
15   15 l, e, i
16   16 d, h, o
17   17 d, c, q
18   18 a, f, e
19   19 e, d, c
20   20 e, n, k
21   21 a, k, f
22   22 j, n, c

Upvotes: 2

Frank
Frank

Reputation: 66819

Here's what I would do:

set.seed(1)
rM = sample(M)
rN = sample(N)

tasks  = rep(rM, each=3)
judges = rep(rN, length.out = length(tasks))

matches = data.frame(judges, tasks)

You can verify that your conditions hold true by tabulating:

tab = with(matches, table(judges, tasks))
max(tab) # 1
addmargins(tab)

      tasks
judges  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 Sum
   A    0  0  0  0  0  0  1  1  0  1  1  0  0  0  0  0  0  0  0  0  0  0   4
   B    1  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  1  0  0  1  0   4
   C    0  1  0  0  0  1  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  1   4
   D    0  0  0  0  0  0  0  0  1  0  0  1  0  0  0  0  0  0  1  1  0  0   4
   E    0  0  0  0  0  1  1  0  0  0  0  0  0  0  0  0  1  0  0  0  0  1   4
   F    0  0  0  0  0  0  1  1  0  0  1  0  0  0  0  0  1  0  0  0  0  0   4
   G    0  0  1  1  1  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0   4
   H    1  0  0  1  0  0  0  0  0  0  0  0  1  1  0  0  0  0  0  0  0  0   4
   I    0  0  0  0  0  0  0  0  1  0  0  0  0  0  1  0  0  1  0  0  1  0   4
   J    0  0  0  0  0  0  0  0  0  1  0  1  0  0  0  0  0  0  1  1  0  0   4
   K    1  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  1  0  0  1  0   4
   L    0  1  0  0  0  1  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  1   4
   M    0  0  1  0  1  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0   3
   N    0  1  0  0  1  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0   3
   O    0  0  0  0  0  0  0  1  0  1  1  0  0  0  0  0  0  0  1  0  0  0   4
   P    0  0  1  1  0  0  0  0  0  0  0  0  1  1  0  0  0  0  0  0  0  0   4
   Q    0  0  0  0  0  0  0  0  1  0  0  1  0  0  1  0  0  0  0  1  0  0   4
   Sum  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  66

Note: Judges close together in rN will draw similar case loads.

Upvotes: 4

josliber
josliber

Reputation: 44340

Usually when I see "random assignment subject to constraints" questions, my mind goes to the following idea:

  1. Select a random weight for assigning item i to category j (in this case assigning case i to judge j)
  2. Use linear programming to identify the assignments that satisfy all constraints (<= 4 cases/judge and 3 reviews per case) with maximum weight.

This is pretty straightforward in R with a linear programming package like lpSolve, creating a binary variable x_ij that indicates whether we assign case i to judge j for every case/judge pair:

library(lpSolve)
set.seed(144)
# vars is a convenience matrix that tells us the i and j index of each variable in our model
vars <- expand.grid(i=M, j=N)
mod <- lp(direction = "max",
          objective.in = rnorm(nrow(vars)),
          const.mat = rbind(t(sapply(M, function(i) as.numeric(vars$i == i))),
                            t(sapply(N, function(j) as.numeric(vars$j == j)))),
          const.dir = rep(c("=", "<="), c(length(M), length(N))),
          const.rhs = rep(c(3, 4), c(length(M), length(N))),
          all.bin = TRUE)

# Extract all cases assigned to each judge
sapply(N, function(j) vars$i[mod$solution > 0.999 & vars$j == j])
# $A
# [1]  2 10 15
# 
# $B
# [1]  7  8 13 22
# 
# $C
# [1] 2 3 7 9
# ...

By the way we've setup the weights and constraints, this can really be thought of as randomly selecting from all feasible assignments of cases to judges.

Upvotes: 6

Related Questions