Reputation: 1284
I have a vector of numbers and I would like to select some of them at random but in order. How could I do it?
For example:
vector <- runif(10, min=0, max=101)
vector
[1] 35.956732 67.608039 20.099881 23.184217 9.157408 34.105185 97.459770 25.805254 74.537667 18.865662
Which code can I use to create a new vector containing, for example, four of the 10 values, with the requirement that those four values appear in the same order as in the original vector? That is, the result cannot be 9.157408 67.608039 74.537667 97.459770, but it could be 67.608039 9.157408 97.459770 74.537667.
Any help would be great. Thanks in advance.
What if I want a certain number of steps between consecutive selected values?
That is, if I have for instance this vector:
[1] 2.1 3.4 1.6 8.9 2.3 5.4 6.4 1.3 10.8 3.7 13.4 2.4 5.4 6.8
How can I select 3 of those 14 values with the additional condition that there must be at least 3 non-selected values between any two consecutive selected ones? For example, a selected vector could be 2.1 5.4 6.8, but it couldn't be 1.6 5.4 10.8.
Upvotes: 3
Views: 491
Reputation: 102251
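Try sample, like
vector[sort(sample(length(vector), 4))]
or
vector[head(which(sample(c(TRUE, FALSE), length(vector), replace = TRUE)), 4)]
Note that the second form can return fewer than four values, since it keeps only the positions that happen to be drawn as TRUE.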
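The first form keeps the original order because sample() draws positions rather than values, and sort() puts those positions back in ascending order before subsetting. A minimal sketch of that idea follows; the helper name ordered_sample and the seed are illustrative only and not part of the answer above.

```r
# Hypothetical helper: pick n elements at random, keeping their original order.
ordered_sample <- function(vec, n) {
  stopifnot(n <= length(vec))
  idx <- sort(sample.int(length(vec), n))  # random positions, ascending
  vec[idx]
}

set.seed(1)  # arbitrary seed, only so the run is reproducible
x <- runif(10, min = 0, max = 101)
picked <- ordered_sample(x, 4)

# Positions of the picked values in x; these should be increasing.
pos <- match(picked, x)
pos
```

Sampling positions instead of values also avoids ambiguity when the vector contains repeated values.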
Update
If you have constraints regarding the minimum spacing between random indices, you can try the code below:
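f1 <- function(vec, n, min_spacing = 4) {
  idx <- seq_along(vec)
  # Rejection sampling: redraw until consecutive indices are far enough apart.
  # Note: this loops forever if no valid selection exists.
  repeat {
    k <- sort(sample(idx, n))
    if (all(diff(k) >= min_spacing)) break
  }
  vec[k]
}
f2 <- function(vec, n, min_spacing = 4) {
  # Pick one value from each block of min_spacing elements. Index into x
  # explicitly, since sample(x, 1) on a length-one numeric x samples from 1:x.
  pick_one <- function(x) x[sample.int(length(x), 1)]
  u <- unname(tapply(vec, ceiling(seq_along(vec) / min_spacing), pick_one))
  # Keep every other block so picks never come from adjacent blocks;
  # this may return fewer than n values when there are too few blocks.
  head(u[seq(1, length(u), by = 2)], n)
}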
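The repeat loop in f1 can take many draws when valid selections are rare, and it never ends when none exists. One alternative, sketched here under the assumption that min_spacing means the minimum difference between consecutive selected indices (as in f1), is to sample from a shortened index range and then stretch the result. The name spaced_index is hypothetical.

```r
# Hypothetical helper: return n ascending indices in 1..len whose
# consecutive differences are all at least min_spacing.
spaced_index <- function(len, n, min_spacing = 4) {
  gap <- min_spacing - 1            # extra positions forced between picks
  m <- len - (n - 1) * gap          # size of the shortened index range
  if (m < n) stop("no selection satisfies the spacing constraint")
  j <- sort(sample.int(m, n))       # distinct ascending positions in 1..m
  j + (seq_len(n) - 1) * gap        # stretch so that diff() >= min_spacing
}

v <- c(2.1, 3.4, 1.6, 8.9, 2.3, 5.4, 6.4, 1.3, 10.8, 3.7, 13.4, 2.4, 5.4, 6.8)
set.seed(42)  # arbitrary seed, only for reproducibility
k <- spaced_index(length(v), 3)
v[k]
```

Each valid index set corresponds to exactly one draw from the shortened range, so a single call to sample.int() is enough and no retries are needed.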
Upvotes: 1
Reputation: 887511
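We can sample 4 elements from the vector, then use match to get their indices, sort the indices so the original order is kept, and subset the vector (this assumes the values are distinct)
v1 <- sample(vector, 4)
vector[sort(match(v1, vector))]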
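If we need to sample 1 element from every block of 4, we could use rollapply by specifying the width and by arguments (v2 is defined at the end of this answer). Picks from adjacent blocks can still be fewer than 4 positions apart.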
library(zoo)
rollapply(v2, 4, by = 4, FUN = function(x) sample(x, 1))
#[1] 1.6 1.3 2.4
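Or use a loop, which collects the selected indices
out <- c()
i <- 1
# stop once i + 4 runs past the end of v2
while ((i + 4) <= length(v2)) {
  i1 <- i:(i + 2)       # candidate positions for this pick
  tmp <- sample(i1, 1)
  out <- c(out, tmp)
  i <- tmp + 3          # next window starts 3 positions after the pick;
                        # use tmp + 4 to leave at least 3 unselected values
}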
out
#[1] 3 7 11
v2 <- c(2.1, 3.4, 1.6, 8.9, 2.3, 5.4, 6.4, 1.3, 10.8, 3.7, 13.4, 2.4,
5.4, 6.8)
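Because v2 contains 5.4 twice, it is safer to check the spacing rule on positions rather than on values. The sketch below tests whether a set of selected positions leaves at least 3 unselected values between consecutive picks; the helper name has_min_gap is hypothetical and not part of the answer above.

```r
v2 <- c(2.1, 3.4, 1.6, 8.9, 2.3, 5.4, 6.4, 1.3, 10.8, 3.7, 13.4, 2.4,
        5.4, 6.8)

# Hypothetical helper: TRUE if every pair of consecutive selected positions
# has at least min_gap unselected positions between them.
has_min_gap <- function(idx, min_gap = 3) {
  all(diff(sort(idx)) > min_gap)
}

ok_example  <- has_min_gap(c(1, 6, 14))  # 2.1, 5.4 (first occurrence), 6.8
bad_example <- has_min_gap(c(3, 6, 9))   # 1.6, 5.4, 10.8
loop_result <- has_min_gap(c(3, 7, 11))  # indices printed by the loop above

c(ok_example, bad_example, loop_result)
#[1]  TRUE FALSE  TRUE
```

The two examples from the question come out as allowed and not allowed respectively, and the loop output 3 7 11 happens to satisfy the stricter rule as well.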
Upvotes: 2
Reputation: 26690
One option is to use the createDataPartition() function from the caret package, e.g.
library(caret)
vector <- runif(10, min=0, max=101)
vector
#>[1] 49.12759 37.39169 99.31837 39.22023 23.15373 62.95305 13.79056 97.71442
#>[9] 52.02225 16.47010
sampling_index <- createDataPartition(y = vector, times = 1,
                                      p = 0.3, list = FALSE)
vector[sampling_index]
#>[1] 49.12759 39.22023 23.15373 97.71442
Upvotes: 2
Reputation: 992
Is this what you're looking for? Simply use the sort function to put the sampled indices in order.
vector <- runif(10, min=0, max=101)
n <- 5
i <- sort(sample(seq_along(vector),n))
vector[i]
Upvotes: 2