Reputation: 5272
I'd like to get the index of the minimum of some subset of a vector, but the index in the original vector, not the renumbered subset.
So far, I've been using:
L = rnorm(20) # say this is the original vector
subset = runif(20)<0.3 # some conditions to extract the subset
ind_min = which.min(L[subset])
ind_sel = seq(L)[subset]
ind_min = ind_sel[ind_min]
but I suspect there is a more direct or cleaner way. I've been thinking of using a trick such as:
L_tmp = L
L_tmp[!subset] = Inf
ind_min = which.min(L_tmp)
which is apparently more efficient:
> microbenchmark(method_1(), method_2(), unit = "relative")
Unit: relative
       expr      min       lq     mean   median       uq      max neval
 method_1() 3.699562 3.249635 3.119666 3.076819 2.928259 3.225849   100
 method_2() 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000   100
but I'm not really happy with it, because I feel there should be a better way. Any suggestions?
Upvotes: 2
Views: 1028
Reputation: 24074
You can try:
(seq(L))[subset][which.min(L[subset])]
which is similar to your first method but without creating temporary variables
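A slightly shorter variant of the same idea, as a sketch: which(subset) already gives the original positions of the selected elements, so it can stand in for seq(L)[subset]. The helper name which_min_in and the fixed seed are assumptions made for illustration; they are not part of the question.

```r
# Hypothetical helper (name chosen for illustration only).
# x:   numeric vector
# sel: logical vector of the same length, TRUE for selected elements
which_min_in <- function(x, sel) {
  idx <- which(sel)          # original positions of the selected elements
  idx[which.min(x[idx])]     # map the position within the subset back to x
}

set.seed(1)                  # arbitrary seed so the example is reproducible
L <- rnorm(20)
subset <- runif(20) < 0.3
which_min_in(L, subset)
```

If nothing is selected, this returns integer(0), whereas the Inf-masking trick would return 1, because which.min() of an all-Inf vector points at its first element.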
Benchmark results on a vector L of length 20000:
method_cath <- function() {
  (seq(L))[subset][which.min(L[subset])]
}
method_FK_corr1 <- function() {
  min = min(L[subset])
  ind_min = intersect(which(L == min), seq(L)[subset])[1]
  return(ind_min)
}
method_FK_corr2 <- function() {
  min = min(L[subset])
  ind_min = intersect(which(L == min), which(subset))[1]
  return(ind_min)
}
method_1clm <- function() {
  ind_min = which.min(L[subset])
  ind_sel = seq(L)[subset]
  ind_min = ind_sel[ind_min]
  return(ind_min)
}
method_2clm <- function() {
  L_tmp = L
  L_tmp[!subset] = Inf
  ind_min = which.min(L_tmp)
  return(ind_min)
}
> microbenchmark(method_2clm(), method_cath(), method_1clm(), method_FK_corr2(), method_FK_corr1(), unit = "relative")
# Unit: relative
#              expr      min       lq     mean   median       uq       max neval cld
#     method_2clm() 1.000000 1.000000 1.000000 1.000000 1.000000 1.0000000   100   a
#     method_cath() 1.312146 1.290370 1.282964 1.278178 1.282424 0.9191693   100   b
#     method_1clm() 1.295031 1.294642 1.303781 1.284630 1.279821 1.2977193   100   b
# method_FK_corr2() 1.185821 1.166924 1.278030 1.155217 1.165738 4.9948007   100   b
# method_FK_corr1() 1.683783 1.644797 1.746055 1.635293 1.636195 5.1616672   100   c
NB: I was getting NA as a result with @FedorenkoKristina's original function, so I tested two possible corrected versions; now all functions give the same result.
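To check that kind of agreement on a reproducible input, a sketch like the following compares three of the approaches on a seeded vector. The seed and the names in res are arbitrary choices for illustration, and the expressions are inlined here rather than wrapped in the benchmarked functions.

```r
set.seed(42)                       # arbitrary seed, for reproducibility only
L <- rnorm(20000)
subset <- runif(20000) < 0.3

res <- c(
  cath     = seq(L)[subset][which.min(L[subset])],
  mask_inf = which.min(replace(L, !subset, Inf)),   # same idea as the Inf trick
  FK_corr2 = intersect(which(L == min(L[subset])), which(subset))[1]
)
print(res)

# All approaches should point at the same position in L
stopifnot(length(unique(res)) == 1)
```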
Upvotes: 5
Reputation: 2777
You can find the minimum of L[subset] and then look up its index in L.
L = rnorm(20) # say this is the original vector
subset = runif(20)<0.3 # some conditions to extract the subset
min = min(L[subset])
ind_min = intersect(which(L == min), seq(L)[subset])[1]
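The intersect() step matters when the minimum value also occurs outside the subset. A small hand-made example (values chosen purely for illustration, not taken from the question) shows the difference:

```r
L <- c(1, 5, 1, 7)                     # the value 1 appears at positions 1 and 3
subset <- c(FALSE, TRUE, TRUE, TRUE)   # position 1 is excluded

m <- min(L[subset])                    # minimum within the subset
all_hits <- which(L == m)              # every position holding that value: 1 and 3
in_subset <- intersect(all_hits, which(subset))[1]   # first hit inside the subset: 3

print(all_hits)
print(in_subset)
```

Without the intersection, position 1 would be reported even though it is not part of the subset.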
Upvotes: 2
Reputation: 616
You could also use the subset() function, where the subset argument is your condition.
L = rnorm(20) # say this is the original vector
subset = runif(20)<0.3 # some conditions to extract the subset
ind_min = which(L == min(subset(L,subset)))
I guess this is very similar to what Fedorenko Kristina suggested. She was faster than me.
Upvotes: 0