Reputation: 324
Is it possible to perform a sampling without replacement? Something like this (it does not work as desired):
x <- rle(c(1,1,1,1,1,2,2))
result <- sample(x, size=2, replace=F)
print(inverse.rle(result))
# prints [1] 1 1 1 1 1 1 2 2
# a desired result [1] 1 1
In other words, I'd like to have the following working but with running length encoding:
set.seed(2)
x <- c(1,1,1,1,1,2,2)
result <- sample(x, size=2, replace=F)
print(result)
# prints [1] 1 2
Upvotes: 1
Views: 111
Reputation: 31
The sampling works out of the box if instead of rle()
you use the S4Vectors function Rle()
.
x <- Rle(c(1,1,1,1,1,2,2))
# numeric-Rle of length 7 with 2 runs
# Lengths: 5 2
# Values : 1 2
xs <- sample(x, 4, replace=F)
# numeric-Rle of length 4 with 2 runs
# Lengths: 2 2
# Values : 2 1
xs2 <- Rle(sort(xs))
# numeric-Rle of length 4 with 2 runs
# Lengths: 2 2
# Values : 1 2
as.vector(xs2)
# [1] 1 1 2 2
Upvotes: 0
Reputation: 2767
Here's a function to do it. You probably need some big numbers to make this worthwhile over just expanding out the rle explicitly.
x <- rle(c(1,1,1,1,1,2,2))
sample_rle <- function(x, ...) {
x$values[1+findInterval(
sample(sum(x$lengths), ...),
cumsum(x$lengths),
left.open=TRUE)]
}
sample_rle(x, size = 2, replace = FALSE)
#> [1] 2 1
sample_rle(x, size = 7, replace = FALSE)
#> [1] 2 1 2 1 1 1 1
Upvotes: 1