JennyD

Reputation: 93

Referring to previous row in calculation

I'm new to R and can't get to grips with how to refer to the previous value of the vector being calculated ("self"), in this case the previous "b", which I tried to write as b[-1].

b <- ( ( 1 / 14 ) * MyData$High + (( 13 / 14 )*b[-1]))

Obviously I need a NA somewhere in there for the first calculation, but I just couldn't figure this out on my own.

Adding example of what the sought after result should be (A=MyData$High):

  A  b
1 5  NA
2 10 0.7142...
3 15 3.0393...
4 20 4.6079...

Upvotes: 8

Views: 9448

Answers (2)

G. Grothendieck

Reputation: 269431

1) for loop. Normally one would just use a simple loop for this:

MyData <- data.frame(A = c(5, 10, 15, 20))


MyData$b <- 0
n <- nrow(MyData)
if (n > 1) for(i in 2:n) MyData$b[i] <- ( MyData$A[i] + 13 * MyData$b[i-1] )/ 14
MyData$b[1] <- NA

giving:

> MyData
   A         b
1  5        NA
2 10 0.7142857
3 15 1.7346939
4 20 3.0393586
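
The same loop can also be sketched on a plain numeric vector and assigned to the data frame once at the end, which avoids modifying the data frame on every iteration. The function name `smooth_prev` and its `k` argument are made up for illustration; `k = 14` reproduces the calculation above.

```r
# Hypothetical helper: b[i] = (A[i] + (k - 1) * b[i - 1]) / k, seeded with b[1] = 0.
smooth_prev <- function(A, k = 14) {
  n <- length(A)
  b <- numeric(n)                # pre-allocated, all zeros
  for (i in seq_len(n)[-1]) {    # empty when n < 2, so 2:n never counts down
    b[i] <- (A[i] + (k - 1) * b[i - 1]) / k
  }
  if (n > 0) b[1] <- NA          # the first row has no previous value
  b
}

MyData <- data.frame(A = c(5, 10, 15, 20))
MyData$b <- smooth_prev(MyData$A)
```

Using `seq_len(n)[-1]` instead of `2:n` makes the separate `if (n > 1)` guard unnecessary.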

2) Reduce. It would also be possible to use Reduce. One first defines a function f that carries out the body of the loop and then has Reduce invoke it repeatedly, like this:

f <- function(b, A) (A + 13 * b) / 14
MyData$b <- Reduce(f, MyData$A[-1], 0, accumulate = TRUE)
MyData$b[1] <- NA

giving the same result.

This gives the appearance of being vectorized but in fact if you look at the source of Reduce it does a for loop itself.
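
To see what Reduce is doing here, it may help to look at the accumulated values before the first one is overwritten with NA. A small sketch using the same f and data, with the arguments spelled out by name:

```r
A <- c(5, 10, 15, 20)
f <- function(b, A) (A + 13 * b) / 14

# accumulate = TRUE keeps every intermediate result, beginning with the
# seed (init = 0), so the output has the same length as A.
steps <- Reduce(f, A[-1], init = 0, accumulate = TRUE)

# Equivalent explicit loop for comparison.
b <- numeric(length(A))
for (i in 2:length(A)) b[i] <- f(b[i - 1], A[i])
```

Here `steps` and `b` hold the same values; the leading 0 is the seed that later gets replaced by NA.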

3) filter. Noting that the form of the problem is a recursive filter with coefficient 13/14 operating on A/14 (but with A[1] replaced with 0) we can write the following. Since filter returns a time series we use c(...) to convert it back to an ordinary vector. This approach actually is vectorized as the filter operation is performed in C.

MyData$b <- c(filter(replace(MyData$A, 1, 0)/14, 13/14, method = "recursive"))
MyData$b[1] <- NA

again giving the same result.
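
One practical caveat: if dplyr is attached, a bare `filter` resolves to dplyr's function rather than the one in stats, so it can be safer to name the namespace explicitly. A sketch of the same computation written that way, using `as.numeric` rather than `c(...)` to drop the time series class:

```r
A <- c(5, 10, 15, 20)

# With method = "recursive", stats::filter computes
#   y[i] = x[i] + (13/14) * y[i - 1]
# so feeding it A/14 (with the first element zeroed) reproduces the loop.
x <- replace(A, 1, 0) / 14
b <- as.numeric(stats::filter(x, 13/14, method = "recursive"))
b[1] <- NA
```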

Note: All solutions assume that MyData has at least 1 row.

Upvotes: 5

NGaffney

Reputation: 1532

There are a couple of ways you could do this.

The first method is a simple loop

df <- data.frame(A = seq(5, 25, 5))
df$b <- 0

for(i in 2:nrow(df)){
  df$b[i] <- (1/14)*df$A[i]+(13/14)*df$b[i-1]
}

df
   A         b
1  5 0.0000000
2 10 0.7142857
3 15 1.7346939
4 20 3.0393586
5 25 4.6079758

This doesn't give the exact values given in the expected answer, but it's close enough that I've assumed you made a transcription mistake. Note that we have to assume that we can take the NA in df$b[1] as being zero or we get NA all the way down.
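
The NA point is easy to check: arithmetic on NA returns NA, so an NA seed propagates through every later row. A small sketch comparing the two seeds (the vector names `b_na` and `b_zero` are just for illustration):

```r
A <- seq(5, 25, 5)

b_na   <- rep(NA_real_, length(A))  # seed is NA
b_zero <- numeric(length(A))        # seed is 0

for (i in 2:length(A)) {
  b_na[i]   <- (1/14) * A[i] + (13/14) * b_na[i - 1]
  b_zero[i] <- (1/14) * A[i] + (13/14) * b_zero[i - 1]
}
```

After the loop every element of `b_na` is NA, while `b_zero` holds the values shown above.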

If you have heaps of data or need to do this a bunch of times, the speed could be improved by implementing the code in C++ and calling it from R.

The second method uses the R function sapply

The form you present the problem in

b_i = (1/14)A_i + (13/14)b_{i-1}

is recursive, which makes it awkward to vectorise directly; however, we can do some maths and find that it is equivalent to

b_i = \frac{1}{14}\sum_{j=1}^{i}\left(\frac{13}{14}\right)^{i-j} A_j
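
For anyone who wants the intermediate step, the sum comes from substituting the recursion into itself repeatedly. This sketch assumes the starting value b_0 is 0, which is what makes the last term vanish:

```latex
\begin{aligned}
b_i &= \tfrac{1}{14}A_i + \tfrac{13}{14}\,b_{i-1} \\
    &= \tfrac{1}{14}A_i + \tfrac{13}{14}\cdot\tfrac{1}{14}A_{i-1}
       + \left(\tfrac{13}{14}\right)^{2} b_{i-2} \\
    &\;\;\vdots \\
    &= \tfrac{1}{14}\sum_{j=1}^{i}\left(\tfrac{13}{14}\right)^{i-j} A_j
       + \left(\tfrac{13}{14}\right)^{i} b_0,
    \qquad b_0 = 0 .
\end{aligned}
```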

We can then write a function which calculates b_i and use sapply to calculate each element

calc_b <- function(n,A){
  # exponents run n-1, n-2, ..., 0; the parentheses make the grouping explicit
  (1/14)*sum((13/14)^(n - (1:n))*A[1:n])
}

df2 <- data.frame(A = seq(10,25,5))
df2$b <- sapply(seq_along(df2$A), calc_b, df2$A)
df2
   A         b
1 10 0.7142857
2 15 1.7346939
3 20 3.0393586
4 25 4.6079758

Note: We need to drop the first row (where A = 5) so that the results line up with rows 2 onward of the loop version, which starts from b = 0 and never uses A[1].
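
As a sanity check, the closed form can be compared against the loop on the same data. This is only a sketch; `b_loop` and `b_closed` are names chosen for the comparison.

```r
A <- seq(5, 25, 5)

# Closed form from above, with the exponent grouping written out.
calc_b <- function(n, A) {
  (1/14) * sum((13/14)^(n - seq_len(n)) * A[seq_len(n)])
}

# Loop version seeded with b[1] = 0 (A[1] is never used).
b_loop <- numeric(length(A))
for (i in 2:length(A)) b_loop[i] <- (A[i] + 13 * b_loop[i - 1]) / 14

# Dropping the first element of A aligns the closed form with rows 2..n.
b_closed <- sapply(seq_along(A[-1]), calc_b, A[-1])
```

The two agree up to floating point error. Bear in mind that each call to `calc_b` sums over n terms, so the total work grows quadratically with the number of rows, whereas the loop is linear.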

Upvotes: 3
