Reputation: 53
I'm trying to figure out if there's a way for dplyr to calculate a variable row-by-row, such that it can reference the results calculated one record prior.
Here is code that achieves what I want using for-loops:
x <- data.frame(x1 = c(1:10))
# This works.
x$x2[1] <- 0
for (i in 2:nrow(x)) {
  x$x2[i] <- x$x2[i-1]*1.1 + 1
}
My naive dplyr attempt, which doesn't work:
# This doesn't work. "Error: object 'x1' not found"
x %>% mutate(x2 = ifelse(x1 == 1, 0, lag(x2)*1.1 + 1))
It would be nice to find a dplyr solution since this step is part of a workflow that heavily relies on it.
Thank you.
Edit:
The above is a simplified example of what I'm trying to do. A closed form solution will not work because the function applied is more complex and dynamic than what is shown here. For example, suppose that 'add_var' and 'pwr_var' are random integers, and I want to calculate this:
x$x2[1] <- 0
for (i in 2:nrow(x)) {
  x$x2[i] <- ( x$x2[i-1]*1.1 + x$add_var[i] ) ^ x$pwr_var[i]
}
Upvotes: 4
Views: 1524
Reputation: 206197
In general, if you want to calculate values that rely on previous values, you are better off using Reduce. Here's an example with your data:
x %>% mutate(x3 = Reduce(function(a, b) a*1.1 + 1, 1:(n()-1), 0, accumulate = TRUE))
But in your example, there is a closed form for each term that doesn't rely on iteration. You can do:
x %>% mutate(x4 = (1.1^(row_number() - 1) - 1) / (1.1 - 1) * 1)
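For the more dynamic recurrence in the question's edit, where no closed form is available, one possible approach is to let Reduce iterate over the row indices instead of the values, so the function can look up add_var[i] and pwr_var[i] at each step. This is only a sketch: the add_var and pwr_var values below are made up with sample() for illustration, and loop_x2 is a hypothetical name used to compare against the for-loop from the question.

```r
library(dplyr)

set.seed(1)  # made-up data, for illustration only
x <- data.frame(
  x1      = 1:10,
  add_var = sample(1:3, 10, replace = TRUE),
  pwr_var = sample(1:2, 10, replace = TRUE)
)

# Reference result: the for-loop from the question's edit
loop_x2 <- numeric(nrow(x))
for (i in 2:nrow(x)) {
  loop_x2[i] <- (loop_x2[i - 1] * 1.1 + x$add_var[i])^x$pwr_var[i]
}

# Same recurrence inside mutate(): Reduce walks over row indices 2..n,
# carrying the previous result in `prev` and starting from init = 0
x <- x %>%
  mutate(x2 = Reduce(
    function(prev, i) (prev * 1.1 + add_var[i])^pwr_var[i],
    seq_len(n())[-1],
    init = 0,
    accumulate = TRUE
  ))

# TRUE if the Reduce version reproduces the loop
identical(x$x2, loop_x2)
```

Because the function passed to Reduce is defined inside mutate(), it can refer to the columns by name. The first row keeps the initial value 0, matching x$x2[1] <- 0 in the loop version.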
Upvotes: 4
Reputation: 1529
If you really want to use the expanded notation, you can load the magrittr package, define a function that performs your transformation, and then apply the pipe operators. Also, use a data_frame object rather than a data.frame object with dplyr.
library(dplyr)
library(magrittr)
x <- data_frame(x1 = c(1:10))
f_x <- function(x) { (x - 1)*1.1 + 1 }
x$x2 <- x %$% x1 %>% f_x
Upvotes: 0
Reputation: 1
Your code works for me. Here is the result:
   x1        x2
1   1  0.000000
2   2  1.000000
3   3  2.100000
4   4  3.310000
5   5  4.641000
6   6  6.105100
7   7  7.715610
8   8  9.487171
9   9 11.435888
10 10 13.579477
Can you try running the dplyr line as follows:
x %>% mutate(x2 = ifelse(x1 == 1, 0, lag(x2)*1.1 + 1))
Upvotes: -3