d.b

Reputation: 32548

Sample unique pairs from two vectors

Given two vectors, a and b:

a = letters[1:6]
b = letters[7:11]

The goal is to sample a two-column matrix from a and b. The first column should contain the elements of a, each appearing exactly twice. The second column should contain the elements of b, each appearing at least twice. In addition, every pair (row) must be unique.

I have figured out how to sample the 12 pairs, but not how to ensure that they are always unique. For example, in the attempt below, rows 3 and 11 are identical.

The desired output should have no duplicate rows.

set.seed(42)
m = cbind(sample(c(a, a)), sample(c(b, b, sample(b, 2, replace = TRUE))))
m
#      [,1] [,2]
# [1,] "e"  "g" 
# [2,] "f"  "k" 
# [3,] "c"  "k" 
# [4,] "b"  "h" 
# [5,] "f"  "j" 
# [6,] "d"  "i" 
# [7,] "e"  "h" 
# [8,] "a"  "g" 
# [9,] "d"  "h" 
#[10,] "a"  "i" 
#[11,] "c"  "k" 
#[12,] "b"  "j" 
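
The three conditions can also be checked mechanically. The following is only a sketch: is_valid_sample() is a hypothetical helper name introduced here (not a base R function), and it assumes m is a two-column character matrix.

```r
a <- letters[1:6]
b <- letters[7:11]

# Hypothetical helper: TRUE only if m satisfies all three conditions
is_valid_sample <- function(m, a, b) {
  a_counts <- table(factor(m[, 1], levels = a))
  b_counts <- table(factor(m[, 2], levels = b))
  all(a_counts == 2) &&      # each element of a appears exactly twice
    all(b_counts >= 2) &&    # each element of b appears at least twice
    !any(duplicated(m))      # no pair (row) occurs more than once
}

# Apply it to the attempt from the question
set.seed(42)
m <- cbind(sample(c(a, a)), sample(c(b, b, sample(b, 2, replace = TRUE))))
is_valid_sample(m, a, b)
```

The result depends on the random draw (and on the R version's sampling algorithm), so no output is shown here.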

Upvotes: 2

Views: 979

Answers (3)

sirallen

Reputation: 1966

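Another way that doesn't require replacement: build a 6 x 5 matrix of 0s and 1s whose rows correspond to a and whose columns correspond to b, with every row summing to 2 and every column summing to at least 2, then shuffle its rows and columns. Each 1 marks a selected pair, so the pairs are unique by construction.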

m = rbind(
  c(1,1,0,0,0),
  c(1,1,0,0,0),
  c(0,0,1,1,0),
  c(0,0,1,1,0),
  c(0,0,0,0,1),
  c(0,0,0,0,1)
)

# Rows 5 and 6 each need a second 1: put it in a random column among the first four
m[5, sample(4,1)] = 1
m[6, sample(4,1)] = 1

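# Scramble it while preserving row/column sums
m = m[sample(6), sample(5)]

# Rows of expand.grid() follow the column-major order of m,
# so as.logical(m) picks out the selected pairs
as.matrix(expand.grid(a = a, b = b))[as.logical(m), ]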

#      a   b  
# [1,] "a" "g"
# [2,] "b" "g"
# [3,] "e" "g"
# [4,] "c" "h"
# [5,] "d" "h"
# [6,] "f" "h"
# [7,] "d" "i"
# [8,] "f" "i"
# [9,] "b" "j"
#[10,] "c" "j"
#[11,] "a" "k"
#[12,] "e" "k"
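
That the construction meets the constraints can be checked directly on the 0/1 matrix: the row sums give the counts for a, the column sums give the counts for b, and each cell corresponds to a distinct pair. A minimal sketch, where build_incidence() is a hypothetical wrapper name around the steps above:

```r
a <- letters[1:6]
b <- letters[7:11]

# Hypothetical wrapper around the construction shown above
build_incidence <- function() {
  m <- rbind(
    c(1, 1, 0, 0, 0),
    c(1, 1, 0, 0, 0),
    c(0, 0, 1, 1, 0),
    c(0, 0, 1, 1, 0),
    c(0, 0, 0, 0, 1),
    c(0, 0, 0, 0, 1)
  )
  m[5, sample(4, 1)] <- 1
  m[6, sample(4, 1)] <- 1
  m[sample(6), sample(5)]
}

m <- build_incidence()
rowSums(m)  # every entry is 2: each element of a is used exactly twice
colSums(m)  # every entry is at least 2: each element of b is used at least twice

pairs <- as.matrix(expand.grid(a = a, b = b))[as.logical(m), ]
anyDuplicated(pairs)  # 0: no repeated rows
```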

Upvotes: 2

Sotos

Reputation: 51582

You can wrap it in a function and use replace() to overwrite the second element of any duplicated pair, e.g.:

f1 <- function(a, b){
  m <- cbind(sample(c(a, a)), sample(c(b, b, sample(b, 2, replace = TRUE))))
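  # Overwrite the second element of each duplicated row with a value of b
  # that does not occur in any duplicated row
  m[, 2] <- replace(m[, 2], duplicated(m), sample(b[!b %in% m[duplicated(m), 2]], 1))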
  return(m)
}

# Repeated runs return no duplicated rows
sum(duplicated(f1(a, b)))
#[1] 0
sum(duplicated(f1(a, b)))
#[1] 0
sum(duplicated(f1(a, b)))
#[1] 0
sum(duplicated(f1(a, b)))
#[1] 0
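
A handful of runs is weak evidence, so a larger check may be worthwhile. The sketch below reruns f1() many times and tabulates the number of duplicated rows. Because the replacement step changes values in the second column, it also tabulates whether each element of b still appears at least twice. The seed and n_runs are arbitrary choices made here.

```r
a <- letters[1:6]
b <- letters[7:11]

# f1() as defined in the answer above
f1 <- function(a, b) {
  m <- cbind(sample(c(a, a)), sample(c(b, b, sample(b, 2, replace = TRUE))))
  m[, 2] <- replace(m[, 2], duplicated(m), sample(b[!b %in% m[duplicated(m), 2]], 1))
  m
}

set.seed(1)     # arbitrary seed, only for reproducibility
n_runs <- 1000  # arbitrary number of repetitions
res <- replicate(n_runs, f1(a, b), simplify = FALSE)

# Number of duplicated rows in each run
n_dup <- vapply(res, function(m) sum(duplicated(m)), integer(1))
# Does every element of b still appear at least twice in each run?
b_ok <- vapply(res, function(m) all(table(factor(m[, 2], levels = b)) >= 2), logical(1))

table(n_dup)
table(b_ok)
```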

Upvotes: 2

Matt Tyers

Reputation: 2215

Definitely not elegant, but it works: keep resampling the second column until all 12 pairs are unique.

a = letters[1:6]
b = letters[7:11]

asamp <- sample(c(a, a))
finished <- FALSE
# Resample the second column until all 12 pairs are distinct
while (!finished) {
  bsamp <- sample(c(b, b, sample(b, 2, replace = TRUE)))
  if (length(unique(paste(asamp, bsamp))) == 12) finished <- TRUE
}
cbind(asamp, bsamp)
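
The loop above is a form of rejection sampling. As a sketch only, the same idea can be wrapped in a function with an upper bound on the number of attempts, so that it cannot loop forever on inputs where unique pairs are rare or impossible. The names sample_unique_pairs() and max_tries are introduced here for illustration, and the function assumes b is no longer than a.

```r
# Hypothetical helper: rejection sampling with a cap on the number of attempts
sample_unique_pairs <- function(a, b, max_tries = 1000) {
  n_extra <- 2 * length(a) - 2 * length(b)  # extra draws needed to fill column 2
  stopifnot(n_extra >= 0)
  asamp <- sample(c(a, a))
  for (i in seq_len(max_tries)) {
    bsamp <- sample(c(b, b, sample(b, n_extra, replace = TRUE)))
    m <- cbind(asamp, bsamp)
    if (!anyDuplicated(m)) return(m)  # accept the first draw with no repeated row
  }
  stop("no sample with unique pairs found in ", max_tries, " attempts")
}

a <- letters[1:6]
b <- letters[7:11]
m <- sample_unique_pairs(a, b)
anyDuplicated(m)  # 0: no repeated rows
```

Using anyDuplicated() on the matrix avoids pasting the columns together and stops at the first repeated row.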

Upvotes: 1
