HCAI
HCAI

Reputation: 2263

R: Recursively add rows

The concentration of germs of hands following j surface contacts can be dictated by the following recursive relationship:

H[j+1]=H[j]+T[j]*(S[j]-H[j])

Where S is the surface concentration the hand touches (and is assumed random for ease). T is the transfer efficiency for each contact. I would like to calculate the eventual hand concentration (with zero starting concentration).

I have a data frame that has a vector of surface contacts and transfer efficiencies for each surface. I have two groups a & b and within each group assume I will touch each one sequentially 1:length(df):

 df <- data.frame(S = runif(10)*100, T = runif(10),g=rep(c("a","b"),each=5))

I would like to compute the cumulative sum of H by group using dplyr where possible.

a special case:

If g = "a", the starting value of H is 0. If g=="b" then the starting value of H is the last value from when g=="a"

Upvotes: 3

Views: 378

Answers (3)

Anoushiravan R
Anoushiravan R

Reputation: 21928

Here is another generalized version I would use for this question:

df$H <- Reduce(function(x, y) {
  x + df$T[y] * (df$g[y] == df$g[y + 1]) * (df$S[y] - x) 
}, init = 0,
seq_len(nrow(df))[-nrow(df)], accumulate = TRUE)

df

           S         T g        H
1  37.698250 0.8550377 a  0.00000
2   3.843585 0.4722659 a 32.23342
3  33.150788 0.3684791 a 18.82587
4   8.948116 0.8893603 a 24.10430
5  57.061844 0.5452377 a 10.62499
6  49.648827 0.7719067 b 10.62499
7  95.403697 0.5835950 b 40.74775
8  10.598677 0.1220491 b 72.64469
9  91.913365 0.2166443 b 65.07203
10 69.644200 0.2603413 b 70.88705

Upvotes: 3

AnilGoyal
AnilGoyal

Reputation: 26218

For the sake of completeness and taking clues from Arun and Onyambu (on a separate question), I am adding baseR answer here too.

transform(df, H = Reduce(function(.x, .y) .x + df$T[.y] * (df$S[.y] - .x) * !c(!duplicated(df$g)[-1], 0)[.y],
                         seq(nrow(df)),
                         init = 0,
                         accumulate = TRUE)[-(1 + nrow(df))])

           S         T g        H
1  37.698250 0.8550377 a  0.00000
2   3.843585 0.4722659 a 32.23342
3  33.150788 0.3684791 a 18.82587
4   8.948116 0.8893603 a 24.10430
5  57.061844 0.5452377 a 10.62499
6  49.648827 0.7719067 b 10.62499
7  95.403697 0.5835950 b 40.74775
8  10.598677 0.1220491 b 72.64469
9  91.913365 0.2166443 b 65.07203
10 69.644200 0.2603413 b 70.88705

Earlier Answer A slight variation of my friend's answer above, I hope that may serve your purpose. Only assumption I am having is that your data is sorted by groups already and a precedes b (exactly as shown in sample). Since you have not given the random seed, I am also taking the same data took by my friend.

  • Strategy/hack, I used 0 value of T inside accumulate2 argument so that last value of H in group a is repeated in first value of group b
library(tidyverse)

df <- read.table(header = TRUE, text = '           S         T g
1  37.698250 0.8550377 a
2   3.843585 0.4722659 a
3  33.150788 0.3684791 a
4   8.948116 0.8893603 a
5  57.061844 0.5452377 a
6  49.648827 0.7719067 b
7  95.403697 0.5835950 b
8  10.598677 0.1220491 b
9  91.913365 0.2166443 b
10 69.644200 0.2603413 b')

df %>%
  mutate(H = accumulate2(S, replace(T, length(g[g=='a']), 0), .init = 0, ~ ..1 + ..3 * (..2 - ..1))[-(1+n())])


           S         T g        H
1  37.698250 0.8550377 a  0.00000
2   3.843585 0.4722659 a 32.23342
3  33.150788 0.3684791 a 18.82587
4   8.948116 0.8893603 a 24.10430
5  57.061844 0.5452377 a 10.62499
6  49.648827 0.7719067 b 10.62499
7  95.403697 0.5835950 b 40.74775
8  10.598677 0.1220491 b 72.64469
9  91.913365 0.2166443 b 65.07203
10 69.644200 0.2603413 b 70.88705

#check - formula
#H[j+1]=H[j]+T[j]*(S[j]-H[j])
# for j =2
# H[2] = H[1] + T[1] * (S[1] -H[1])

0 + 0.8550377 * (37.698250 - 0)
#> [1] 32.23342

#for j=7 (second row group b)

#H[6] + T[6] * (S[6] - H[6])
10.62499 + 0.7719067 * (49.648827 - 10.62499)
#> [1] 40.74775

Created on 2021-07-10 by the reprex package (v2.0.0)

Upvotes: 5

akrun
akrun

Reputation: 887251

Here is a similar approach as showed by @AnilGoyal for a generalized case

library(dplyr)
library(purrr)
df %>%
    mutate(H = accumulate2(S, T* !lead(!duplicated(g), default = FALSE),
          .init = 0, ~ ..1 + ..3 * (..2 - ..1))[-n()])

Upvotes: 5

Related Questions