How to extract subsets with overlaps from a data frame in R

Question

Can anyone suggest a function for extracting subsets from a data frame? More specifically: say I have a data frame with 1000 rows. I want to create a data "window" of 10 rows, calculate the standard deviation of the current "window" (subset), then move the window 5 rows further and do the same for the next "window". So I don't want to skip any rows; instead, I want an overlap of 5 rows between consecutive "windows". Thanks!

Jilber Urbina · Accepted Answer

You are looking for rollmean (and, more generally, rollapply) from the zoo package:

Example

> library(zoo) 
> x.Date <- as.Date(paste(2004, rep(1:4, 4:1), sample(1:28, 10), sep = "-"))
> set.seed(1)
> x <- zoo(rnorm(12), x.Date) # create a time series indexed by the dates
> rollmean(x, 5) # rolling mean over 5 consecutive observations
2004-01-10 2004-01-11 2004-02-21 2004-02-27 2004-02-28 2004-03-13 
 0.1292699  0.3938814  0.3550785  0.1836873  0.2621149  0.1351357

In this example the moving window is 5 observations wide and advances by 1 observation at a time, so consecutive windows overlap by 4 observations.

Take a look at ?rollmean; ?rollapply can also be helpful.

> rollapply(x, width=5, by=2, mean)
2004-01-10 2004-02-21 2004-02-28 
 0.1292699  0.3550785  0.2621149

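Using rollapply lets you control how far the window moves between calculations through the by argument. Note that in this case the moving window is still 5 observations wide but advances by 2 observations each time, so consecutive windows overlap by 3 observations. Any function can be passed in place of mean, for example sd for the standard deviation asked about in the question.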
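
Applied to the windows described in the question (10 rows wide, moved 5 rows at a time), a minimal sketch might look like the following. The data frame df, its value column and the variable names are made up for illustration, and align = "left" is an assumption chosen here so that the first window covers rows 1 to 10. The base R loop at the end is only a cross-check of the same windows without zoo.

```r
library(zoo)

set.seed(42)
# Hypothetical data frame standing in for the asker's 1000-row data
df <- data.frame(value = rnorm(1000))

width <- 10  # rows per window
step  <- 5   # rows the window advances, so neighbouring windows share 5 rows

# Standard deviation of each window; with align = "left" each result
# belongs to the window that starts at that row
sd_zoo <- rollapply(df$value, width = width, by = step, FUN = sd,
                    align = "left")

# Base R cross-check: explicit window start rows 1, 6, 11, ..., 991
starts  <- seq(1, nrow(df) - width + 1, by = step)
sd_base <- sapply(starts, function(i) sd(df$value[i:(i + width - 1)]))

length(starts)
all.equal(as.numeric(sd_zoo), sd_base)
```

The overlap between neighbouring windows is always width minus by, so width = 10 with by = 5 gives the 5-row overlap the question asks for.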
