Reputation: 121
Dear community members,
I use a function of Maximum Subarray Sum (Kadane algorithm) very often in my tool. It is a bottleneck.
Could you tell me if the performance can be improved using some internal R tricks? It works faster in C++ - but I do not really want to add C++ code to my R-based tool...
maxSubArraySum <- function(x){
bestSoFar = 0
bestNow = 0
bestStartIndexSoFar = -1
bestStopIndexSoFar = -1
bestStartIndexNow = -1
for (i in 1:length(x)) {
value = bestNow + x[i]
if (value > 0) {
if (bestNow == 0) {
bestStartIndexNow = i
}
bestNow = value
}
else
bestNow = 0
if (bestNow > bestSoFar) {
bestSoFar = bestNow
bestStopIndexSoFar = i
bestStartIndexSoFar = bestStartIndexNow
}
}
return(c(bestSoFar, bestStartIndexSoFar, bestStopIndexSoFar))
}
Upvotes: 2
Views: 536
Reputation: 50403
If you do not want to use Rcpp, here is a faster plain-R vectorized implementation:
fast_maxSubArraySum <- function(x) {
csum = cumsum(x)
cmin = cummin(csum)
gaps = csum - cmin
endPos = which.max(gaps)
integralMax = gaps[endPos]
startPos = which.min(cmin[1:endPos])+1
integralMin = gaps[startPos-1]
gap = integralMax - integralMin
if(cmin[startPos] > 0) {
startPos = 1
gap = csum[endPos]
}
return(c(gap, startPos, endPos))
}
This version is more than 10 times faster than maxSubArraySum
on my machine.
Note that the same strategy can also be used to write an even faster C++ version.
Upvotes: 2