Iterative calculation in dplyr using result of previous calculation

Question

I am looking to perform a calculation on a field in a dataframe with the following logic:

If basevalue is not NA, assign basevalue to the result.
If basevalue is NA, take the previous result, multiply it by the multiplier field, and output that as the result.

Assume that the first value is never NA, so there is always a seed value. I would like to perform the calculation within groups of data (dplyr::group_by).

The following code gives a reprex:

basevalue <- c(2, 5, NA, NA, NA, NA)
multiplier <- c(3.2, 1.1, 1.8, 1.3, 1.5, 1.2)
previous_result <- c(NA, 2, 5, 9, 11.7, 17.55)
result <- c(2, 5, 9, 11.7, 17.55, 21.06)
logic <- c(rep("basevalue != NA, so take base value", 2), rep("basevalue == NA, so take lag(result) * multiplier", 4))

dfIn <- data.frame(basevalue, multiplier)
dfOut <- data.frame(basevalue, multiplier, result, previous_result, logic)

Is there a way to achieve this using simple dplyr / base R / tidyverse logic, or do I need to use a specialist package such as zoo?

David Robinson · Accepted Answer

You can do this with the accumulate2 function from purrr, which is designed for applying this kind of recursive relationship across two vectors.

library(dplyr)
library(purrr)

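# previous: the result accumulated so far
# basevalue, multiplier: the current row's values
calculate <- function(previous, basevalue, multiplier) {
  # keep basevalue when it is present, otherwise use multiplier * previous
  coalesce(basevalue, multiplier * previous)
}

dfIn %>%
  mutate(lst = accumulate2(basevalue, multiplier[-1], calculate),
         result = unlist(lst))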

Two notes:

The multiplier[-1] throws away the first multiplier value, since accumulate2 expects its second vector to be one element shorter than the first (notice that you'll never use the first multiplier value, since there's no "previous" value at that point).
The result of accumulate2 can be a list (older versions of purrr always return one), so we add the unlist() to make sure it is a plain vector.
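
The question also asks about running this per group, which the code above does not show. A minimal sketch of how that might look, assuming a grouping column called grp (a hypothetical name with made-up values, not part of the original data): inside a grouped mutate(), multiplier[-1] is evaluated separately for each group, so each group is seeded from its own first basevalue.

```r
library(dplyr)
library(purrr)

# Same rule as in the answer: keep basevalue when present,
# otherwise multiply the previous result by the current multiplier.
calculate <- function(previous, basevalue, multiplier) {
  coalesce(basevalue, multiplier * previous)
}

# Hypothetical data: `grp` and these values are illustrative only.
dfGrouped <- data.frame(
  grp        = c("a", "a", "a", "b", "b", "b"),
  basevalue  = c(2, NA, NA, 5, NA, NA),
  multiplier = c(3.2, 1.5, 2, 1.1, 2, 3)
)

dfResult <- dfGrouped %>%
  group_by(grp) %>%
  # unlist() is harmless if accumulate2 already returned a plain vector
  mutate(result = unlist(accumulate2(basevalue, multiplier[-1], calculate))) %>%
  ungroup()

dfResult$result
# 2 3 6 5 10 30
```

This relies on the assumption stated in the question that the first basevalue in every group is not NA; if it were, the NA would propagate through that group until the next non-NA basevalue.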
