German Demidov
German Demidov

Reputation: 121

Fastest implementation of Maximum Subarray Sum in R

Dear community members,

I use a function of Maximum Subarray Sum (Kadane algorithm) very often in my tool. It is a bottleneck.

Could you tell me if the performance can be improved using some internal R tricks? It works faster in C++ - but I do not really want to add C++ code to my R-based tool...

maxSubArraySum <- function(x){
  bestSoFar = 0
  bestNow = 0
  bestStartIndexSoFar = -1
  bestStopIndexSoFar = -1
  bestStartIndexNow = -1
  for (i in 1:length(x)) {
    value = bestNow + x[i]
    if (value > 0) {
      if (bestNow == 0) {
        bestStartIndexNow = i
      }
      bestNow = value
    }
    else
      bestNow = 0
    
    if (bestNow > bestSoFar) {
      bestSoFar = bestNow
      bestStopIndexSoFar = i
      bestStartIndexSoFar = bestStartIndexNow
    }
  }
  return(c(bestSoFar, bestStartIndexSoFar, bestStopIndexSoFar))
}

Upvotes: 2

Views: 536

Answers (1)

J&#233;r&#244;me Richard
J&#233;r&#244;me Richard

Reputation: 50403

If you do not want to use Rcpp, here is a faster plain-R vectorized implementation:

fast_maxSubArraySum <- function(x) {
    csum = cumsum(x)
    cmin = cummin(csum)
    gaps = csum - cmin
    endPos = which.max(gaps)
    integralMax = gaps[endPos]
    startPos = which.min(cmin[1:endPos])+1
    integralMin = gaps[startPos-1]
    gap = integralMax - integralMin
    if(cmin[startPos] > 0) {
        startPos = 1
        gap = csum[endPos]
    }
    return(c(gap, startPos, endPos))
}

This version is more than 10 times faster than maxSubArraySum on my machine. Note that the same strategy can also be used to write an even faster C++ version.

Upvotes: 2

Related Questions