Calculate the sum of subsequent rows for a subset of values

Question

I posted a question earlier on that topic, but I think it was not clear enough. Sorry. So, this is the second try.

I have data on the amount of milk consumed (volume) at different times for different individuals.

individual <- c(rep("A", 7), rep("B", 6))
time <- c(0, 12, 20, 26, 32, 36, 50, 0, 10, 21, 24, 36, 60)
volume <- c(0.3, 0.2, 0.1, 0.4, 0.3, 0.1, 0.2, 0.2, 0.4, 0.4, 0.3, 0.2, 0.1)
df <- data.frame(individual, time, volume)

So, I want to know how much milk is consumed during the 24 hours after a milk ingestion. For example, individual A at time 0 h (first line in df) drank 0.3 L of milk and then drank an additional 0.2 L at time 12 and 0.1 L at time 20, which gives a total of 0.6 L drunk during the 24-hour period following that milk ingestion.

I want to calculate this for every line for each individual and the desired output would be:

res_volume <- c(0.6, 1.1, 0.9, 1.0, NA, NA, NA, 1.3, 1.1, 0.9, 0.5, 0.3, NA)
df2 <- data.frame(df, res_volume)

The NA values are there because there is not enough data to cover 24 hours after the milk ingestion (the difference in time between the last line for that individual and the given line is less than 24 hours).

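To make the rule concrete, here is a small hand check for individual A only. It is just a sketch of the arithmetic, not a general solution; the names time_a, volume_a, in_window, first_total and covered are made up for this illustration.

```r
# Individual A's rows, copied from the data above
time_a   <- c(0, 12, 20, 26, 32, 36, 50)
volume_a <- c(0.3, 0.2, 0.1, 0.4, 0.3, 0.1, 0.2)

# Row 1: ingestion at 0 h, so the window is 0 h to 24 h inclusive
start <- time_a[1]
in_window <- time_a >= start & time_a <= start + 24
first_total <- sum(volume_a[in_window])   # 0.3 + 0.2 + 0.1

# Row 5: ingestion at 32 h, but the last record is at 50 h (18 h later),
# so the 24 h window is not fully covered and the result should be NA
covered <- (max(time_a) - time_a[5]) >= 24
```

Here first_total corresponds to the 0.6 L in the first line of the desired output, and covered is FALSE for the row at 32 h.
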
Any idea how I could achieve this? Your answers are really appreciated.

cdeterman · Accepted Answer

Does this function work for you? You can set the interval at whatever increment you like with the default at 24.

milk_iter_sum <- function(df, interval = 24) {
  res_volume <- vector()
  # Split on the column inside df, not on a global `individual` vector
  df_list <- split(df, f = df$individual)
  for (i in seq_along(df_list)) {
    cur_df <- df_list[[i]]
    for (j in seq_len(nrow(cur_df))) {
      # Rows from the current ingestion up to `interval` hours later
      inner_cur_df <- cur_df[cur_df$time >= cur_df$time[j] &
                               cur_df$time <= cur_df$time[j] + interval, ]

      if (cur_df$time[nrow(cur_df)] - inner_cur_df$time[1] < interval) {
        # Less than `interval` hours of data remain after this row
        res_volume <- append(res_volume, NA)
      } else {
        res_volume <- append(res_volume, sum(inner_cur_df$volume))
      }
    }
  }
  # Assumes df is sorted by individual, then by time, so the rows line up
  return(cbind(df, res_volume))
}

milk_iter_sum(df)

   individual time volume res_volume
1           A    0    0.3        0.6
2           A   12    0.2        1.1
3           A   20    0.1        0.9
4           A   26    0.4        1.0
5           A   32    0.3         NA
6           A   36    0.1         NA
7           A   50    0.2         NA
8           B    0    0.2        1.3
9           B   10    0.4        1.1
10          B   21    0.4        0.9
11          B   24    0.3        0.5
12          B   36    0.2        0.3
13          B   60    0.1         NA
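
If you would rather not depend on the row order of the split groups, the same windowed sum can be written as a small helper applied per individual. This is only a sketch under the same rule as above (window inclusive at both ends, NA when fewer than `interval` hours of data follow the row); the helper name window_sum is made up for this example, and it uses the latest time per individual rather than the last row.

```r
# Hypothetical helper: windowed sum for one individual's records
window_sum <- function(time, volume, interval = 24) {
  last_time <- max(time)
  vapply(seq_along(time), function(j) {
    # Not enough data after this row to cover the full interval
    if (last_time - time[j] < interval) return(NA_real_)
    # Sum volumes from this ingestion up to `interval` hours later
    sum(volume[time >= time[j] & time <= time[j] + interval])
  }, numeric(1))
}

# Same data as in the question
df <- data.frame(
  individual = c(rep("A", 7), rep("B", 6)),
  time = c(0, 12, 20, 26, 32, 36, 50, 0, 10, 21, 24, 36, 60),
  volume = c(0.3, 0.2, 0.1, 0.4, 0.3, 0.1, 0.2, 0.2, 0.4, 0.4, 0.3, 0.2, 0.1)
)

# Fill the result in place, one individual at a time,
# so each value stays on its original row
df$res_volume <- NA_real_
for (id in unique(df$individual)) {
  rows <- df$individual == id
  df$res_volume[rows] <- window_sum(df$time[rows], df$volume[rows])
}
```

Because the results are written back through a logical index, this version does not require df to be sorted by individual.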
