Reputation: 101
I posted a question on this topic earlier, but I think it was not clear enough. Sorry. So, this is a second try.
I have data on the amount of milk consumed (volume) at different times for different individuals.
individual <- c(rep("A", 7), rep("B", 6))
time <- c(0, 12, 20, 26, 32, 36, 50, 0, 10, 21, 24, 36, 60)
volume <- c(0.3, 0.2, 0.1, 0.4, 0.3, 0.1, 0.2, 0.2, 0.4, 0.4, 0.3, 0.2, 0.1)
df <- data.frame(individual, time, volume)
So, I want to know how much milk is consumed during the 24 hours following each milk ingestion. For example, individual A at time 0 h (first line in df) drank 0.3 L of milk and then drank an additional 0.2 L at time 12 h and 0.1 L at time 20 h, which gives a total of 0.6 L drunk during the 24-hour period following that ingestion.
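To make the arithmetic in that example concrete, here is a minimal sketch for that single row only. It assumes the 24-hour window is inclusive at both ends; the names `a`, `start` and `total_24h` are just illustrative.

```r
individual <- c(rep("A", 7), rep("B", 6))
time <- c(0, 12, 20, 26, 32, 36, 50, 0, 10, 21, 24, 36, 60)
volume <- c(0.3, 0.2, 0.1, 0.4, 0.3, 0.1, 0.2, 0.2, 0.4, 0.4, 0.3, 0.2, 0.1)
df <- data.frame(individual, time, volume)

# Rows for individual A only
a <- df[df$individual == "A", ]

# Window starts at the first ingestion (time 0 h) and runs 24 h, ends included
start <- a$time[1]
total_24h <- sum(a$volume[a$time >= start & a$time <= start + 24])
print(total_24h)  # 0.3 + 0.2 + 0.1, about 0.6 L
```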
I want to calculate this for every line for each individual and the desired output would be:
res_volume <- c(0.6, 1.1, 0.9, 1.0, NA, NA, NA, 1.3, 1.1, 0.9, 0.5, 0.3, NA)
df2 <- data.frame(df, res_volume)
The NAs are there because there is not enough data to cover 24 hours after the milk ingestion (the time between the given line and the last line for that individual is less than 24 hours).
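The rule for the missing values can be checked on its own. A rough sketch, assuming "not enough data" means the last recorded time for the individual is less than 24 hours after the row in question (`last_time` and `too_short` are illustrative names):

```r
individual <- c(rep("A", 7), rep("B", 6))
time <- c(0, 12, 20, 26, 32, 36, 50, 0, 10, 21, 24, 36, 60)
volume <- c(0.3, 0.2, 0.1, 0.4, 0.3, 0.1, 0.2, 0.2, 0.4, 0.4, 0.3, 0.2, 0.1)
df <- data.frame(individual, time, volume)

# Last recorded time per individual, repeated on every row of that individual
last_time <- ave(df$time, df$individual, FUN = max)

# TRUE where fewer than 24 h of records follow the row
too_short <- last_time - df$time < 24
df[too_short, ]
```

With the data above this flags the last three rows of A and the last row of B, which are the rows marked as missing in the desired output.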
Any idea how I could achieve this? Your answers are really appreciated.
Upvotes: 0
Views: 452
Reputation: 19960
Does this function work for you? You can set the interval to any length you like; the default is 24 hours.
milk_iter_sum <- function(df, interval = 24) {
  res_volume <- vector()
  # One data frame per individual (rows of df must be grouped by individual
  # and sorted by time so the results line up with df in cbind below)
  df_list <- split(df, f = df$individual)
  for (i in seq_along(df_list)) {
    cur_df <- df_list[[i]]
    for (j in seq_len(nrow(cur_df))) {
      # Rows from this ingestion up to `interval` hours later, ends included
      inner_cur_df <- cur_df[cur_df$time >= cur_df$time[j] & cur_df$time <= cur_df$time[j] + interval, ]
      if (cur_df$time[nrow(cur_df)] - inner_cur_df$time[1] < interval) {
        # Records stop less than `interval` hours after this row
        res_volume <- append(res_volume, NA)
      } else {
        res_volume <- append(res_volume, sum(inner_cur_df$volume))
      }
    }
  }
  return(cbind(df, res_volume))
}
milk_iter_sum(df)
   individual time volume res_volume
1           A    0    0.3        0.6
2           A   12    0.2        1.1
3           A   20    0.1        0.9
4           A   26    0.4        1.0
5           A   32    0.3         NA
6           A   36    0.1         NA
7           A   50    0.2         NA
8           B    0    0.2        1.3
9           B   10    0.4        1.1
10          B   21    0.4        0.9
11          B   24    0.3        0.5
12          B   36    0.2        0.3
13          B   60    0.1         NA
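If the rows are not guaranteed to be grouped by individual, a variant that writes results back by row index may be safer. This is only a sketch of the same windowed-sum idea, not a replacement for the function above; `window_sum` is a hypothetical helper name, and it uses the latest time per individual rather than the last row.

```r
individual <- c(rep("A", 7), rep("B", 6))
time <- c(0, 12, 20, 26, 32, 36, 50, 0, 10, 21, 24, 36, 60)
volume <- c(0.3, 0.2, 0.1, 0.4, 0.3, 0.1, 0.2, 0.2, 0.4, 0.4, 0.3, 0.2, 0.1)
df <- data.frame(individual, time, volume)

# Hypothetical helper: windowed sums for one individual's records
window_sum <- function(time, volume, interval = 24) {
  vapply(seq_along(time), function(j) {
    # NA when the records do not extend a full interval past this row
    if (max(time) - time[j] < interval) return(NA_real_)
    # Sum every ingestion from this one up to `interval` hours later
    in_window <- time >= time[j] & time <= time[j] + interval
    sum(volume[in_window])
  }, numeric(1))
}

# Fill the result column individual by individual, keeping df's row order
df$res_volume <- NA_real_
for (id in unique(df$individual)) {
  rows <- df$individual == id
  df$res_volume[rows] <- window_sum(df$time[rows], df$volume[rows])
}
df
```

Because the results are assigned through a logical row index, the order of rows in df does not matter here.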
Upvotes: 1
Reputation: 21502
If I understand you correctly, start by identifying the rows that follow a "long interval":
therows <- which(df$interval > 1) + 1
Then
df[therows, c(1, 2, 4)]
should be your desired result.
Upvotes: 0