Reputation: 2263
The concentration of germs of hands following j surface contacts can be dictated by the following recursive relationship:
H[j+1]=H[j]+T[j]*(S[j]-H[j])
Where S
is the surface concentration the hand touches (and is assumed random for ease). T
is the transfer efficiency for each contact. I would like to calculate the eventual hand concentration (with zero starting concentration).
I have a data frame that has a vector of surface contacts and transfer efficiencies for each surface. I have two groups a
& b
and within each group assume I will touch each one sequentially 1:length(df)
:
df <- data.frame(S = runif(10)*100, T = runif(10),g=rep(c("a","b"),each=5))
I would like to compute the cumulative sum of H
by group using dplyr
where possible.
a special case:
If g = "a"
, the starting value of H
is 0
.
If g=="b"
then the starting value of H
is the last value from when g=="a"
Upvotes: 3
Views: 378
Reputation: 21928
Here is another generalized version I would use for this question:
df$H <- Reduce(function(x, y) {
x + df$T[y] * (df$g[y] == df$g[y + 1]) * (df$S[y] - x)
}, init = 0,
seq_len(nrow(df))[-nrow(df)], accumulate = TRUE)
df
S T g H
1 37.698250 0.8550377 a 0.00000
2 3.843585 0.4722659 a 32.23342
3 33.150788 0.3684791 a 18.82587
4 8.948116 0.8893603 a 24.10430
5 57.061844 0.5452377 a 10.62499
6 49.648827 0.7719067 b 10.62499
7 95.403697 0.5835950 b 40.74775
8 10.598677 0.1220491 b 72.64469
9 91.913365 0.2166443 b 65.07203
10 69.644200 0.2603413 b 70.88705
Upvotes: 3
Reputation: 26218
For the sake of completeness and taking clues from Arun and Onyambu (on a separate question), I am adding baseR answer here too.
transform(df, H = Reduce(function(.x, .y) .x + df$T[.y] * (df$S[.y] - .x) * !c(!duplicated(df$g)[-1], 0)[.y],
seq(nrow(df)),
init = 0,
accumulate = TRUE)[-(1 + nrow(df))])
S T g H
1 37.698250 0.8550377 a 0.00000
2 3.843585 0.4722659 a 32.23342
3 33.150788 0.3684791 a 18.82587
4 8.948116 0.8893603 a 24.10430
5 57.061844 0.5452377 a 10.62499
6 49.648827 0.7719067 b 10.62499
7 95.403697 0.5835950 b 40.74775
8 10.598677 0.1220491 b 72.64469
9 91.913365 0.2166443 b 65.07203
10 69.644200 0.2603413 b 70.88705
Earlier Answer
A slight variation of my friend's answer above, I hope that may serve your purpose. Only assumption I am having is that your data is sorted by groups already and a
precedes b
(exactly as shown in sample). Since you have not given the random seed, I am also taking the same data took by my friend.
0
value of T
inside accumulate2
argument so that last value of H
in group a
is repeated in first value of group b
library(tidyverse)
df <- read.table(header = TRUE, text = ' S T g
1 37.698250 0.8550377 a
2 3.843585 0.4722659 a
3 33.150788 0.3684791 a
4 8.948116 0.8893603 a
5 57.061844 0.5452377 a
6 49.648827 0.7719067 b
7 95.403697 0.5835950 b
8 10.598677 0.1220491 b
9 91.913365 0.2166443 b
10 69.644200 0.2603413 b')
df %>%
mutate(H = accumulate2(S, replace(T, length(g[g=='a']), 0), .init = 0, ~ ..1 + ..3 * (..2 - ..1))[-(1+n())])
S T g H
1 37.698250 0.8550377 a 0.00000
2 3.843585 0.4722659 a 32.23342
3 33.150788 0.3684791 a 18.82587
4 8.948116 0.8893603 a 24.10430
5 57.061844 0.5452377 a 10.62499
6 49.648827 0.7719067 b 10.62499
7 95.403697 0.5835950 b 40.74775
8 10.598677 0.1220491 b 72.64469
9 91.913365 0.2166443 b 65.07203
10 69.644200 0.2603413 b 70.88705
#check - formula
#H[j+1]=H[j]+T[j]*(S[j]-H[j])
# for j =2
# H[2] = H[1] + T[1] * (S[1] -H[1])
0 + 0.8550377 * (37.698250 - 0)
#> [1] 32.23342
#for j=7 (second row group b)
#H[6] + T[6] * (S[6] - H[6])
10.62499 + 0.7719067 * (49.648827 - 10.62499)
#> [1] 40.74775
Created on 2021-07-10 by the reprex package (v2.0.0)
Upvotes: 5
Reputation: 887251
Here is a similar approach as showed by @AnilGoyal for a generalized case
library(dplyr)
library(purrr)
df %>%
mutate(H = accumulate2(S, T* !lead(!duplicated(g), default = FALSE),
.init = 0, ~ ..1 + ..3 * (..2 - ..1))[-n()])
Upvotes: 5