Stromberg

Reputation: 105

Vectorizing R-loop for better performance

I am having trouble finding a vectorized form of a specific loop in R. My goal is to improve the loop's performance, because it has to run thousands of times in the algorithm.

For each row of an array, I want to find the position of the lowest value within each section of the row, where the section widths are given by a vector 'Level'.

Example:

Level = c(2,3)

Let the first row of array X be: c(2, -1, 3, 0.5, 4).

Searching for the position of the lowest value in the range 1:Level[1] of the row (that is, (2, -1)) gives 2, because -1 < 2 and -1 is in the second position of the row. Searching the second range, (Level[1]+1):(Level[1]+Level[2]) (that is, (3, 0.5, 4)), gives 4, because 0.5 < 3 < 4 and 0.5 is in the fourth position of the row.
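The worked example above can be reproduced for a single row as a quick sanity check. This is only a minimal sketch: the names x, offset, section and pos are illustrative and do not appear in the original post.

```r
# Worked example from the text: one row, two sections of widths 2 and 3.
Level <- c(2, 3)
x <- c(2, -1, 3, 0.5, 4)

# Number of columns that precede each section (0 for the first section).
offset <- c(0, cumsum(Level)[-length(Level)])

# Label each column with its section, find the position of the minimum
# within each section, then shift it back to a position in the full row.
section <- rep(seq_along(Level), Level)
pos <- as.vector(tapply(x, section, which.min)) + offset

print(pos)
# [1] 2 4
```

The result matches the positions 2 and 4 derived by hand above.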

I have to perform this over each row in the array.

My solution to the problem works as follows:

Level = c(2,3,3)  #elements per section, here: 3 sections with 2,3 and 3 levels
rows = 10  #number of rows in array X
X = matrix(runif(rows*sum(Level),-5,5),rows,sum(Level))  #array with 10 rows and sum(Level) columns, here: 8
Position_min = matrix(0,rows,length(Level))  #array in which the position of minimum values for each section and row are stored
for(i in 1:rows){
 for(j in 1:length(Level)){            #length(Level) is number of intervals, here: 3
  if(j == 1){coeff=0}else{coeff=1}
  Position_min[i,j] = coeff*sum(Level[1:(j-1)]) + which(X[i,(coeff*sum(Level[1:(j-1)])+1):sum(Level[1:j])] == min(X[i,(coeff*sum(Level[1:(j-1)])+1):sum(Level[1:j])]))
  }
}

It works fine but I would prefer a solution with better performance. Any ideas?

Upvotes: 3

Views: 193

Answers (2)

Ken Benoit

Reputation: 14902

Here is a "fully vectorized" solution with no explicit loops:

findmins <- function(x, level) {
    series <- rep(seq_along(level), level)  # use the argument 'level', not the global 'Level'
    x <- split(x, factor(series))
    minsSplit <- as.numeric(sapply(x, which.min))
    minsSplit + c(0, cumsum(level[-length(level)]))
}

Position_min_vectorized <- t(apply(X, 1, findmins, Level))
identical(Position_min, Position_min_vectorized)
## [1] TRUE

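You may get better performance by turning your matrix into a list of rows and then using parallel's mclapply(). Note that mclapply() relies on forking, which is not available on Windows, and that for small matrices the overhead of parallelization may outweigh the gain: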

X_list <- split(X, factor(1:nrow(X)))
do.call(rbind, parallel::mclapply(X_list, findmins, Level))
##    [,1] [,2] [,3]
## 1     1    5    6
## 2     2    3    6
## 3     1    4    7
## 4     1    5    6
## 5     2    5    7
## 6     2    4    6
## 7     1    5    8
## 8     1    5    8
## 9     1    3    8
## 10    1    3    8
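Whether the parallel version actually pays off depends on the size of the matrix and on the machine, so it is worth timing both variants on your own data. The following is a rough sketch, not a benchmark result: the seed, the row count of 2000 and the choice of two cores are assumptions made for illustration, and findmins() is repeated here only so the snippet is self-contained.

```r
set.seed(1)                      # assumed seed, only for reproducibility
Level <- c(2, 3, 3)
rows <- 2000                     # assumed size, larger than in the question
X <- matrix(runif(rows * sum(Level), -5, 5), rows, sum(Level))

# Per-row helper: position of the minimum in each section of one row.
findmins <- function(x, level) {
    series <- rep(seq_along(level), level)
    minsSplit <- as.numeric(sapply(split(x, factor(series)), which.min))
    minsSplit + c(0, cumsum(level[-length(level)]))
}

# Serial version: apply() over the rows.
serial_time <- system.time(
    res_serial <- t(apply(X, 1, findmins, Level))
)

# Parallel version: one list element per row, processed by forked workers.
# Forking is unavailable on Windows, so fall back to a single core there.
X_list <- split(X, factor(seq_len(nrow(X))))
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
parallel_time <- system.time(
    res_par <- do.call(rbind,
                       parallel::mclapply(X_list, findmins, Level,
                                          mc.cores = n_cores))
)

# Both variants must agree before the timings mean anything.
stopifnot(all(res_serial == res_par))
print(c(serial = serial_time[["elapsed"]],
        parallel = parallel_time[["elapsed"]]))
```

No timings are shown here on purpose, since they vary between machines and runs.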

Upvotes: 3

Joswin K J

Reputation: 710

This removes the outer loop over the rows, leaving only the loop over the sections:

Level1=c(0,cumsum(Level))
for(j in 1:(length(Level1)-1)){
    Position_min[,j]=max.col(-X[,(Level1[j]+1):Level1[j+1]])+(Level1[j])
}
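The column-block idea above can be wrapped in a function and checked against a per-row which.min() reference. This is a sketch under some assumptions: block_argmin, bounds and reference are illustrative names not taken from the answer, drop = FALSE is added so that a one-row matrix or a one-column section stays a matrix, and ties.method = "first" is chosen so that ties resolve as in which.min() (the default for max.col() is "random"). As far as I know, max.col() also treats values within a small relative tolerance as tied, so near-equal minima could in principle differ from which.min().

```r
set.seed(42)                     # assumed seed, only for reproducibility
Level <- c(2, 3, 3)
rows <- 10
X <- matrix(runif(rows * sum(Level), -5, 5), rows, sum(Level))

# Position of the row-wise minimum in each column block of X.
block_argmin <- function(X, Level) {
    bounds <- c(0, cumsum(Level))            # block j spans bounds[j]+1 .. bounds[j+1]
    out <- matrix(0, nrow(X), length(Level))
    for (j in seq_along(Level)) {
        cols <- (bounds[j] + 1):bounds[j + 1]
        # max.col() on the negated block gives the row-wise argmin.
        out[, j] <- max.col(-X[, cols, drop = FALSE], ties.method = "first") +
            bounds[j]
    }
    out
}

# Reference: handle each row separately with which.min().
reference <- t(apply(X, 1, function(x) {
    as.vector(tapply(x, rep(seq_along(Level), Level), which.min)) +
        c(0, cumsum(Level)[-length(Level)])
}))

stopifnot(all(block_argmin(X, Level) == reference))
```

The remaining loop runs once per section rather than once per row, so its cost should stay small as long as the number of sections is much smaller than the number of rows.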

Upvotes: 3
