Ber

Reputation: 53

dplyr: Iterative calculation

I'm trying to figure out if there's a way for dplyr to calculate a variable row-by-row, such that it can reference the results calculated one record prior.

Here is code that achieves what I want using for-loops:

x <- data.frame(x1 = c(1:10)) 

#This works.
x$x2[1] <- 0

for (i in 2:nrow(x)) {
  x$x2[i] <- x$x2[i-1]*1.1 + 1
}

My naive dplyr attempt, which doesn't work:

#This doesn't work. "Error: object'x1' not found"
x %>% mutate(x2 = ifelse(x1 == 1, 0, lag(x2)*1.1 + 1))

It would be nice to find a dplyr solution since this step is part of a workflow that heavily relies on it.

Thank you.


Edit:

The above is a simplified example of what I'm trying to do. A closed form solution will not work because the function applied is more complex and dynamic than what is shown here. For example, suppose that 'add_var' and 'pwr_var' are random integers, and I want to calculate this:

x$x2[1] <- 0

for (i in 2:nrow(x)) {
  x$x2[i] <- ( x$x2[i-1]*1.1 + x$add_var[i] ) ^ x$pwr_var[i]
}

Upvotes: 4

Views: 1524

Answers (3)

MrFlick

Reputation: 206197

In general, if you want to calculate values that rely on previous values, you are better off using Reduce. Here's an example with your data:

x %>% mutate(x3 = Reduce(function(a, b) a*1.1 + 1, 1:(n()-1), 0, accumulate = TRUE))
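For the more general recurrence in your edit, the same idea can probably be adapted by reducing over row indices instead of ignoring the second argument. This is only a rough sketch: the `add_var` and `pwr_var` values below are made up for illustration (fixed rather than random, so the result is reproducible), and `x_reduce` is just a name I picked.

```r
library(dplyr)

# Illustrative data only: add_var and pwr_var are invented values,
# standing in for the random integers described in the question.
x <- data.frame(
  x1      = 1:5,
  add_var = c(0, 1, 2, 1, 3),
  pwr_var = c(1, 2, 1, 2, 1)
)

x_reduce <- x %>%
  mutate(x2 = Reduce(
    # prev is the value computed for the previous row, i is the current row index
    function(prev, i) (prev * 1.1 + add_var[i])^pwr_var[i],
    x = seq_len(n())[-1],   # rows 2..n; row 1 is supplied by init
    init = 0,
    accumulate = TRUE       # keep every intermediate value, not just the last
  ))
```

The function receives the running value and the next row index, so it can look up any column for that row. With `accumulate = TRUE` and `init = 0` the result has one value per row, which is what `mutate` needs.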

But in your example, there is a closed form for the term that doesn't rely on iteration. You can do

x %>% mutate(x4=(1.1^(row_number()-1)-1)/(1.1-1)*1)
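The closed form comes from unrolling the recurrence: starting from 0, row n is 1 + 1.1 + 1.1^2 + ... + 1.1^(n-2), a geometric series that sums to (1.1^(n-1) - 1)/(1.1 - 1). If you want to convince yourself, a small base R check along these lines compares it with the loop from the question (the names `loop` and `closed` are just for this sketch):

```r
n <- 10

# Iterative version, same recurrence as the for-loop in the question
loop <- numeric(n)
for (i in 2:n) {
  loop[i] <- loop[i - 1] * 1.1 + 1
}

# Closed form: partial sums of a geometric series with ratio 1.1
closed <- (1.1^(seq_len(n) - 1) - 1) / (1.1 - 1)

# Compare with a floating-point tolerance rather than exact equality
all.equal(loop, closed)
```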

Upvotes: 4

polka

Reputation: 1529

If you really want to use the expanded notation, you can load the magrittr library, define a function that performs your transformation, and then apply it with the pipe operators. Also, use a data_frame object rather than a data.frame object with dplyr.

    library(dplyr)
    library(magrittr)
    x <- data_frame(x1 = c(1:10))
    f_x <- function(x){(x-1)*1.1+1}
    x$x2 <- x %$% x1 %>% f_x

Upvotes: 0

alingir

Reputation: 1

Your code works for me. Here is the result:

   x1        x2
1   1  0.000000
2   2  1.000000
3   3  2.100000
4   4  3.310000
5   5  4.641000
6   6  6.105100
7   7  7.715610
8   8  9.487171
9   9 11.435888
10 10 13.579477

Can you try running the dplyr line as follows:

 x %>% mutate(x2 = ifelse(x1 == 1, 0, lag(x2)*1.1 + 1))

Upvotes: -3
