Reputation: 871
I would like to know how to split a vector by percentage. I tried the stats::quantile function, but it does not separate the elements correctly when the same value occurs several times. I would like a method that splits based only on the length of the vector (that is, on element positions), without taking the values into account.
library(dplyr)

vector <- c(1,1,1, 4:10)
minProb <- 0.1
maxProb <- 0.9
l <- length(vector)
dt <- data.frame("id" = 1:l, "value" = vector)
dt <- dt %>% arrange(value)
# Slicing by position:
#min <- l*minProb
#max <- l*maxProb
#data1 <- dt$id[min:max]
#data2 <- dt$id[-c(min:max)]
# Slicing by value quantiles:
#q <- quantile(dt$value, probs=c(minProb,maxProb))
#w <- which(dt$value >= q[1] & dt$value <= q[2])
Desired output:
> g2
1 10
> g1
2 3 4 5 6 7 8 9
Upvotes: 1
Views: 400
Reputation: 76673
The following splits the vector, though it is not clear whether this is what the question asks for.
l <- length(vector)
qq <- quantile(seq_along(vector), probs = c(minProb, maxProb))
f <- logical(l)
f[round(qq[1])] <- TRUE
f[round(qq[2])] <- TRUE
split(vector, cumsum(f))
#$`0`
#[1] 1
#
#$`1`
#[1] 1 1 4 5 6 7 8
#
#$`2`
#[1] 9 10
To get the indices instead of the values, as requested in a comment, use
split(seq_along(vector), cumsum(f))
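If the goal is instead the two groups shown in the question (`g1` as the middle indices and `g2` as the trimmed ones), a minimal sketch that looks only at positions might be the following. The cut-off rule (keep ranks strictly above `n * minProb` and up to `n * maxProb`) and the names `n`, `ord`, `pos` and `keep` are assumptions chosen to match that output, not something stated in the question.

```r
vector <- c(1, 1, 1, 4:10)
minProb <- 0.1
maxProb <- 0.9

n <- length(vector)
ord <- order(vector)   # original indices, arranged by ascending value
pos <- seq_len(n)      # rank 1..n within the sorted vector

# Assumed cut-off rule: keep ranks strictly above n * minProb
# and up to and including n * maxProb.
keep <- pos > n * minProb & pos <= n * maxProb

g1 <- ord[keep]    # indices inside the range
g2 <- ord[!keep]   # indices outside the range
```

For the example vector this gives `g1` equal to 2:9 and `g2` equal to 1 and 10. Because the comparison is made on ranks rather than values, repeated values do not change the group sizes. Note that `n * minProb` is a floating-point product, so for other lengths the boundary may need explicit rounding.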
Upvotes: 1