Greg Martin
Greg Martin

Reputation: 343

Aggregate data in an xts object by time and second (factor) variable

I'm new to time series analysis and using xts (and R in general) so please forgive the basic nature of the question.

I'm wanting to aggregate the data by a time frame (e.g. months) and by a second factor variable. To illustrate my question, please see the following:

require(xts)

# Create example df and convert it to an xts object 

date <- sample(seq(as.Date("2015/01/01"), as.Date("2016/12/31"), by="day"),12)
colour <- c("Red", "Red", "Blue", "Blue", "Blue", "Blue", "Red", "Red", "Red", 
            "Red", "Blue", "Blue")
value <- sample(1:10, 12, replace = TRUE)
df <- cbind.data.frame(date, colour, value)
df <- xts(df[,-1], order.by = df$date)

This creates a sample dataframe that looks like this:

         colour value
2015-01-30 "Blue" "2"  
2015-03-15 "Blue" "9"  
2015-03-22 "Blue" "9"  
2015-08-13 "Blue" "5"  
2015-09-01 "Blue" "8"  
2015-11-10 "Red"  "7"  
2016-04-26 "Blue" "2"  
2016-07-06 "Red"  "9"  
2016-07-07 "Red"  "6"  
2016-07-08 "Red"  "2"  
2016-10-01 "Red"  "6"  
2016-11-07 "Red"  "2"  

I can summarise the "value" variable using:

apply.monthly(df$value, FUN = mean)

To give me:

             value
2015-01-30 2.000000
2015-03-22 9.000000
2015-08-13 5.000000
2015-09-01 8.000000
2015-11-10 7.000000
2016-04-26 2.000000
2016-07-08 5.666667
2016-10-01 6.000000
2016-11-07 2.000000

But I can't quite see how to aggregate by (in this case) the colour variable (I'd like the sum of each colour by month). Any help would be greatly appreciated.

Upvotes: 2

Views: 1588

Answers (2)

Raad
Raad

Reputation: 2715

How about this?

aggregate(as.numeric(df$value),
          list(Month = format(index(df), "%Y-%m"), 
               Colour = df$colour), 
          mean)

In response to your comment below:

# You can replace the format with the following to get a year month object
zoo::as.yearmon(index(df))

# Or you can covert to date by using the first of every month
as.Date(paste(format(index(df), "%Y-%m"), "-01", sep = ""))

You might find more ideas here: Converting year and month ("yyyy-mm" format) to a date in R?

Upvotes: 2

FXQuantTrader
FXQuantTrader

Reputation: 6891

If you want to work with xts objects after subsetting by colour, it's easy to work with each time series (by colour) separately in a list like this:

df <- cbind.data.frame(date, colour, value)

> class(df)
#[1] "data.frame"

# data.frame split (not xts split) to separate data by colour in a list object:
l_out <- split(df, colour)

> class(l_out[[1]])
[1] "data.frame"

mthly_mean <- function(x) {
    apply.monthly(as.xts(x[, "value"], x[, "date"]), mean)    
}

# Each element in the list is an xts object (not a data.frame) containing the mean of the data for each month:
l_res <- lapply(l_out, FUN = mthly_mean)
# or more succinctly:
# l_res <- lapply(l_out, FUN = function(x) apply.monthly(as.xts(x[, "value"], x[, "date"]), mean))

> l_res
# $Blue
# [,1]
# 2015-01-15  8.0
# 2015-07-21  4.5
# 2016-01-28  5.0
# 2016-04-28  4.0
# 2016-05-08  2.0
# 
# $Red
# [,1]
# 2015-11-30    3
# 2016-01-18    7
# 2016-02-25    5
# 2016-04-17    1
# 2016-05-23    6
# 2016-07-14    5

> class(l_res[[1]])
[1] "xts" "zoo"

Upvotes: 1

Related Questions