Moving average with grouped data

Question

I would like to calculate the Moving Average of my dataset that is composed by a column representing an index of grouped patients and a second column representing some measurements of a blood circulating molecule. Patients are grouped according to consecutive measurements of the molecule of interest.

Moreover I would like to plot the output in which the measurement per group is plotted against the patients group number.

I tried to code this analysis but I'm not sure I'm doing well.

  SURG_DATE VES_2A Index
 21/05/2013    1     1
 10/06/2013    1     1
 06/01/2014    1     1
 29/01/2014    0     1
 11/03/2014    3     2
 05/04/2014    1     2
 06/04/2014    1     2
 14/05/2014    1     2
 28/05/2014    3     3
 02/09/2014    2     3
 16/09/2014    2     3
 17/09/2014    0     3
 21/10/2014    2     5
 05/12/2014    0     5
 19/12/2014    2     5
 11/01/2015    1     5
 15/01/2015    1     6
 17/01/2015    2     6
 24/01/2015    1     6
 19/02/2015    1     6

The code I tried:

tapply(test$VES_2A, 
       test$Index, 
       function(x) rollmean(x, 12, na.pad=TRUE))

jay.sf · Accepted Answer

It's a little ambiguous, but I think you want this:

test <- cbind(time=rownames(test), test)  # first add a time variable

# then create a list with rolling mean for each id and time
ls1 <- lapply(seq_along(test$time), 
              function(x) cbind(time=x,  # time variable
                                with(test[test$time %in% 1:x, ], 
                                     aggregate(list(VES_2A=VES_2A), 
                                               list(Index=Index), mean))  # rolling mean
                                ))

tot <- transform(t(sapply(ls1, colMeans)), Index="total")  # occasionally add a total column

long <- rbind(do.call(rbind, ls1), tot)  # bind all rows together into long format data frame
wide <- reshape2::dcast(long, time ~ Index)  # reshape to wide w/ e.g. reshape2::dcast()
rm(ls1, tot)  # clean up

Yielding

> wide
   time    1        2        3        5        6    total
1     1 1.00       NA       NA       NA       NA 1.000000
2     2 1.00       NA       NA       NA       NA 1.000000
3     3 1.00       NA       NA       NA       NA 1.000000
4     4 0.75       NA       NA       NA       NA 0.750000
5     5 0.75 3.000000       NA       NA       NA 1.875000
6     6 0.75 2.000000       NA       NA       NA 1.375000
7     7 0.75 1.666667       NA       NA       NA 1.208333
8     8 0.75 1.500000       NA       NA       NA 1.125000
9     9 0.75 1.500000 3.000000       NA       NA 1.750000
10   10 0.75 1.500000 2.500000       NA       NA 1.583333
11   11 0.75 1.500000 2.333333       NA       NA 1.527778
12   12 0.75 1.500000 1.750000       NA       NA 1.333333
13   13 0.75 1.500000 1.750000 2.000000       NA 1.500000
14   14 0.75 1.500000 1.750000 1.000000       NA 1.250000
15   15 0.75 1.500000 1.750000 1.333333       NA 1.333333
16   16 0.75 1.500000 1.750000 1.250000       NA 1.312500
17   17 0.75 1.500000 1.750000 1.250000 1.000000 1.250000
18   18 0.75 1.500000 1.750000 1.250000 1.500000 1.350000
19   19 0.75 1.500000 1.750000 1.250000 1.333333 1.316667
20   20 0.75 1.500000 1.750000 1.250000 1.250000 1.300000

Plot

library(ggplot2)
ggplot(long, aes(time, VES_2A, color=Index)) +
  geom_line()

Tell me what you think, hope that's what you've wanted.

Data

test <- structure(list(VES_2A = c(1L, 1L, 1L, 0L, 3L, 1L, 1L, 1L, 3L, 
                                  2L, 2L, 0L, 2L, 0L, 2L, 1L, 1L, 2L, 1L, 1L), Index = c(1L, 1L, 
                                                                                         1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 5L, 5L, 5L, 5L, 6L, 6L, 
                                                                                         6L, 6L)), class = "data.frame", row.names = c(NA, -20L))

Moving average with grouped data

Answers (2)

Note

Related Questions