Reputation: 3207
I want to create two vectors by sampling from the same data with replace=TRUE.
Both vectors may contain the same values, but they must not hold the same value at the same index position.
For example:
> set.seed(1)
> a <- sample(15, 10, replace=T)
> b <- sample(15, 10, replace=T)
> a
[1] 4 6 9 14 4 14 15 10 10 1
> b
[1] 4 3 11 6 12 8 11 15 6 12
> a==b
[1] TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
In this case, vectors a and b contain the same value at index 1 (the value 4), which is wrong for my purposes.
Is there an easy way to correct this?
And can it be done at the sampling step?
Or should I loop through the vectors element by element and, whenever the values are identical, draw b[i] again and re-check until it differs?
many thanks!
Upvotes: 4
Views: 79
Reputation: 4284
My idea is, instead of drawing 2 samples of length 10 with replacement, to draw 10 samples of length 2 without replacement:
library(purrr)
# rerun() is deprecated as of purrr 1.0.0; map() over 1:10 does the same job
l <- map(1:10, ~ sample(15, 2, replace = FALSE))
Each element of l is an integer vector of length two. Those two integers are guaranteed to be different because we specified replace=FALSE in sample.
# extract the first element of each pair; this is a
a <- map_int(l, `[[`, 1)
# extract the second element of each pair; this is b
b <- map_int(l, `[[`, 2)
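If you would rather avoid the purrr dependency, the same idea can probably be written in base R with replicate. This is only a sketch of the approach described above, not something from the original answer; the name pairs is made up for illustration.

```r
set.seed(1)

# Draw 10 pairs; each pair is sampled without replacement,
# so the two values within a pair always differ.
# replicate() simplifies the result to a 2 x 10 integer matrix.
pairs <- replicate(10, sample(15, 2, replace = FALSE))

a <- pairs[1, ]  # first value of every pair
b <- pairs[2, ]  # second value of every pair

# No position may hold the same value in both vectors.
stopifnot(all(a != b))
```

Values can still repeat within a and within b, because the pairs are drawn independently of each other.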
Upvotes: 6
Reputation: 50678
How about a two-stage sampling process?
set.seed(1)
x <- 1:15
a <- sample(x, 10, replace = TRUE)
b <- sapply(a, function(v) sample(x[x != v], 1))
a != b
#[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
We first draw the samples a; then, for every element of a, we draw a new value from the set x excluding that element. Since we do this one sample at a time, values can still repeat within b, so we automatically allow for sampling with replacement.
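One thing to watch if you reuse this on other data: when sample() is given a single number n of at least 1, it samples from 1:n rather than returning that number. That cannot happen here, since 14 candidates always remain, but it could with a very small x. Below is a minimal sketch of a guarded version; draw_other is a hypothetical helper name, not part of the answer or of base R.

```r
# Hypothetical helper: draw one value from x that differs from v.
draw_other <- function(x, v) {
  candidates <- x[x != v]
  stopifnot(length(candidates) > 0)
  # Sample an index instead of the values themselves, so a single
  # remaining candidate is returned as-is rather than expanded to 1:n.
  candidates[sample.int(length(candidates), 1)]
}

set.seed(1)
x <- 1:15
a <- sample(x, 10, replace = TRUE)
# vapply() checks that every draw is a single integer.
b <- vapply(a, function(v) draw_other(x, v), integer(1))

stopifnot(all(a != b))
```

With only two possible values, for example draw_other(c(5L, 9L), 5L), the helper returns 9, whereas a bare sample(9L, 1) would pick from 1:9.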
Upvotes: 3