ZKC

Reputation: 5

How to take a 5-day average around a specific date in R

I have a dataset that looks like this, but without weekends:

 X1          X2
3798 2009-12-29           0
3799 2009-12-30           0
3800 2009-12-31           0 
3802 2010-01-02           0
3803 2010-01-03         2.1
3804 2010-01-04           0
3805 2010-01-05           0
3806 2010-01-06           0
3807 2010-01-07           0
3808 2010-01-08           0
3809 2010-01-09           0
3810 2010-01-10         6.8
3811 2010-01-12           0
3812 2010-01-13           0
3813 2010-01-14        17.7
3814 2010-01-16           0
3815 2010-01-17           0
3816 2010-01-18         1.5
3817 2010-01-19           0
3818 2010-01-20           0
3819 2010-01-21           0
3820 2010-01-22           0
3821 2010-01-23           0
3822 2010-01-24           0
3823 2010-01-25           0
3824 2010-01-26           0
3825 2010-01-27         4.5
3826 2010-01-28           0
3827 2010-01-29           0
3828 2010-01-31           0
3829 2010-02-01           0
3830 2010-02-03           0
3831 2010-02-04           0
3832 2010-02-05           0
3833 2010-02-07           0
3834 2010-02-08           0
3835 2010-02-09         1.2  

I want to take a 5-day average around the 15th day of each month. If the 15th falls on a weekend and doesn't exist in the dataset, I want to take the 5-day average around the closest available date (the 14th or 16th). Is that possible?

This is the expected output:

 X1          X2         5-day average
 1         2009-12-14           2
 2         2010-01-15           3 
 3         2010-02-15           4
 4         2010-03-16           2 
 5         2010-04-15           1
 6         2010-05-14           7

Upvotes: 0

Views: 94

Answers (1)

shadow

Reputation: 22293

Rolling averages are easy to compute with the rollapply function from the zoo package. You can then extract the ones you need (i.e. those around the 15th of each month).
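To see what a width-5 rolling mean does, here is a minimal sketch on a made-up vector (`x` and `rm5` are illustrative names, not part of the question's data). It relies on rollapply's default centered alignment, with `fill=NA` padding the ends where a full window is not available.

```r
library(zoo)

# Toy input, purely illustrative
x <- c(1, 2, 3, 4, 5, 6, 7)

# Width 5, default centered alignment: each result is the mean of the
# observation itself plus two neighbours on either side.
rm5 <- rollapply(x, 5, mean, fill = NA)
print(rm5)
# [1] NA NA  3  4  5 NA NA
```

Note that the window counts rows, not calendar days. When dates are missing from the data, the five rows around a date can span more than five calendar days.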

# packages used
require(data.table)
require(zoo)
# data preparation
df <- read.table(text=' X1          X2
                 3798 2009-12-29           0
                 3799 2009-12-30           0
                 3800 2009-12-31           0 
                 3802 2010-01-02           0
                 3803 2010-01-03         2.1
                 3804 2010-01-04           0
                 3805 2010-01-05           0
                 3806 2010-01-06           0
                 3807 2010-01-07           0
                 3808 2010-01-08           0
                 3809 2010-01-09           0
                 3810 2010-01-10         6.8
                 3811 2010-01-12           0
                 3812 2010-01-13           0
                 3813 2010-01-14        17.7
                 3814 2010-01-16           0
                 3815 2010-01-17           0
                 3816 2010-01-18         1.5
                 3817 2010-01-19           0
                 3818 2010-01-20           0
                 3819 2010-01-21           0
                 3820 2010-01-22           0
                 3821 2010-01-23           0
                 3822 2010-01-24           0
                 3823 2010-01-25           0
                 3824 2010-01-26           0
                 3825 2010-01-27         4.5
                 3826 2010-01-28           0
                 3827 2010-01-29           0
                 3828 2010-01-31           0
                 3829 2010-02-01           0
                 3830 2010-02-03           0
                 3831 2010-02-04           0
                 3832 2010-02-05           0
                 3833 2010-02-07           0
                 3834 2010-02-08           0
                 3835 2010-02-09         1.2', header=TRUE)
setDT(df)
# convert the date column in place (:= modifies df by reference; <- would not)
df[, X1 := as.Date(X1)]
setkey(df, X1)
# taking rolling averages
df[, rmean:=rollapply(X2, 5, mean, fill=NA)]
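# extracting the rolling averages you need:
# flag, within each month, the date(s) closest to the 15th
dt <- df[, list(day15=abs(mday(X1)-15) == min(abs(mday(X1)-15)), X1, rmean), by=list(year(X1), month(X1))]
# all flagged rows (the 14th and 16th can tie when the 15th is missing)
dt[day15==TRUE]
# keep only the first flagged row per month
dt[day15==TRUE, .SD[1,] ,by=list(month, year)]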
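As a possible shortcut, the flag-then-filter steps can be collapsed into one call with which.min, which returns the first position of the minimum and so resolves a 14th/16th tie in favour of the earlier date. The sketch below is self-contained and uses a small made-up table (`toy` and `closest` are hypothetical names, and the February values are invented for illustration).

```r
library(data.table)
library(zoo)

# Made-up data with gaps: 2010-01-15 is missing, 2010-02-15 is present
toy <- data.table(
  X1 = as.Date(c("2010-01-12", "2010-01-13", "2010-01-14", "2010-01-16",
                 "2010-01-17", "2010-02-11", "2010-02-12", "2010-02-15",
                 "2010-02-16", "2010-02-17")),
  X2 = c(0, 0, 17.7, 0, 0, 1, 2, 3, 4, 5)
)
setkey(toy, X1)

# Centered 5-row rolling mean over the whole table
toy[, rmean := rollapply(X2, 5, mean, fill = NA)]

# One row per month: the date nearest the 15th
# (which.min keeps the first match when two dates are equally close)
closest <- toy[, .SD[which.min(abs(mday(X1) - 15))],
               by = .(year = year(X1), month = month(X1))]
print(closest)
```

Here January resolves to 2010-01-14 and February to 2010-02-15. Because the rolling mean is computed before grouping, a window near a month boundary can include rows from the neighbouring month.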

Upvotes: 1
