Lila
Lila

Reputation: 39

Averaging data over a week, with a week having between 0 and 2 values

I have a "weekly" xts object. The data is referring to futures (fronth month futures). The weeks were calculated as follows:

  1. week 1 ends on the last business day of the previous calendar month
  2. week 2 ends on the 5th business day of the current calendar month
  3. week 3 ends on the 10th business day of the current calendar month
  4. week 4 ends on the day when the near contract expires

The data looks as follows (part of the data as example):

2005-09-30  0.0019094400
2005-10-07  0.0020219110
2005-10-14  0.0067063990
2005-10-20  0.0074893360
2005-10-31  0.0028761630
2005-11-07  0.0011331470
2005-11-14  0.0031749880
2005-11-18  0.0007342980
2005-11-30  0.0025730810
2005-12-07 -0.0003133450
2005-12-14 -0.0008288860
2005-12-20  0.0013468400
2005-12-30  0.0012742930
2006-01-09 -0.0007873670
2006-01-17 -0.0004193150
2006-01-20 -0.0005391370
2006-01-31 -0.0022229660

If I call "string" on my dataset, I get the following (you can ignore X here, the important data is Date and Risk.Premium):

   'data.frame':    484 obs. of  3 variables:
 $ Date        : num  NA NA NA NA NA NA NA NA NA NA ...
 $ Risk.Premium: num  0.00191 0.00202 0.00671 0.00749 0.00288 ...
 $ X           : logi  NA NA NA NA NA NA ...

As you can see, there is between 0 and 2 values for one week in the data. I want to transform the data into "proper" weekly data, so I can compare it with other weekly data (i.e. if a week has 2 values, take the average, if a week has zero values, drop this week). Does anyone have an idea how to do this?

Upvotes: 0

Views: 59

Answers (1)

Julian Zucker
Julian Zucker

Reputation: 564

Creating sample data:

mydf <- data.frame(
  date  = c("2005-09-30", "2005-10-07", "2005-10-08", "2005-11-12"),
  value = c(1, 2, 3, 4))

Then create a function that, given a week, finds the average. It will return NaN if given a week that doesn't have any values.

weekAverage <- function(week) {
  vals <- mydf[which(format(as.Date(mydf$date), "%W") == week), 2]
  mean(vals)
}

Then apply the function to each unique week in the data frame.

weeks <- unique(format(as.Date(mydf$date), "%W"))
weeklyAverages <- data.frame(
  Week = weeks,
  Average = sapply(weeks, weekAverage))

weeklyAverages
##   Week Average
##39   39     1.0
##40   40     2.5
##45   45     4.0

This works if all your weeks are within the same year, because it only factors in the week of the year, not the year. If you want to include year, you can change the format string to "%W %Y" or similar.

Upvotes: 1

Related Questions