DaniCee

Reputation: 3207

Make 2 subset vectors so that values are different index-wise

I want to create two vectors by sampling from the same data with replace=TRUE.

Even if both vectors can contain the same values, they cannot be the same at the same index position.

For example:

> set.seed(1)
> a <- sample(15, 10, replace=T)
> b <- sample(15, 10, replace=T)
> a
 [1]  4  6  9 14  4 14 15 10 10  1
> b
 [1]  4  3 11  6 12  8 11 15  6 12
> a==b
 [1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

In this case, vectors a and b contain the same value at index 1 (value==4), which is wrong for my purposes.

Is there an easy way to correct this?

And can it be done on the subset step?

Or should I loop over the elements one by one and, whenever a[i] and b[i] are identical, draw a new value for b[i] and repeat the check until they differ?

Many thanks!

Upvotes: 4

Views: 79

Answers (2)

fmarm

Reputation: 4284

My idea is this: instead of drawing 2 samples of length 10 with replacement, draw 10 samples of length 2 without replacement.

 library(purrr)
 # rerun() is deprecated as of purrr 1.0.0; map() over 1:10 does the same job
 l <- map(1:10, ~ sample(15, 2, replace = FALSE))

Each element of l is an integer vector of length two. The two integers are guaranteed to differ because sample is called with replace=FALSE.

 # extract the first element of each pair in l; this is a
 a <- map_int(l,`[[`,1)
 # extract the second element of each pair in l; this is b
 b <- map_int(l,`[[`,2)
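
For anyone who would rather avoid the purrr dependency, the same pairs idea can probably be sketched in base R. This is only a sketch: the name pairs is made up for illustration, and it relies on replicate() simplifying its result to a 2 x 10 matrix, which it does by default when every draw has length two.

```r
set.seed(1)

# Draw 10 pairs without replacement; each column of the matrix is one pair
pairs <- replicate(10, sample(15, 2, replace = FALSE))

# First row becomes a, second row becomes b
a <- pairs[1, ]
b <- pairs[2, ]

# Values may repeat within a or within b, but never at the same index
all(a != b)
```

Since each column is drawn without replacement, a[i] and b[i] can never match, while different columns are independent and so can share values.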

Upvotes: 6

Maurits Evers

Reputation: 50678

How about a two-stage sampling process?

set.seed(1)
x <- 1:15
a <- sample(x, 10, replace = TRUE)
b <- sapply(a, function(v) sample(x[x != v], 1))
a != b
#[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

We first draw the samples a; then, for each value in a, we draw one new value from x with that value excluded. Because each element of b is drawn independently, one at a time, values can still repeat within b, so this is effectively sampling with replacement.
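
If the per-element sapply call becomes a bottleneck for long vectors, a vectorised variant of the same two-stage idea might look like the sketch below. It is a different technique from the code above (a random non-zero offset on the index, wrapped around with modular arithmetic), and it assumes the values in x are all distinct. The names n, k, ia, ib and offset are introduced here purely for illustration.

```r
set.seed(1)
x <- 1:15
n <- length(x)   # number of candidate values (assumed distinct)
k <- 10          # length of each output vector

# Stage 1: sample index positions for a, with replacement
ia <- sample(n, k, replace = TRUE)

# Stage 2: shift each index by a random non-zero offset in 1..(n - 1),
# wrapping around, so ib can never equal ia at the same position
offset <- sample(n - 1, k, replace = TRUE)
ib <- (ia + offset - 1) %% n + 1

a <- x[ia]
b <- x[ib]
all(a != b)
```

Because the offset is uniform over 1..(n - 1), each b[i] is uniform over the n - 1 values other than a[i], which is the same distribution the sapply version produces.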

Upvotes: 3
