Reputation: 67
I have data, which has multiple monthly variables. I would like to aggregate these variables to quarterly level. My initial data is:
Time A B C D . . . . . K
Jan-2004 42 57 53 28
Feb-2004 40 78 56 28
Mar-2004 68 77 53 20
Apr-2004 97 96 80 16
May-2004 84 93 76 17
Jun-2004 57 100 100 21
Jul-2004 62 100 79 22
.
.
.
.
N
So the goal is calculate quarters as monthly averages (sum(jan+feb+mar)/3)). In other words, the goal is to end up data like this:
Time A B C D . . . . . K
2004Q1 50,0 70,7 54,0 25,3
2004Q2 79,3 96,3 85,3 18,0
2004Q3
.
.
.
N
Could anyone help me with this problem?
Thank you very much.
Upvotes: 1
Views: 3041
Reputation: 34703
data.table
has quarter
function, you can do:
library(data.table)
setDT(my_data)
my_data[ , lapply(.SD, mean), by = .(year = year(Time), quarter = quarter(Time))]
This is the gist of it. Getting it to work exactly would require a reproducible example.
Upvotes: 1
Reputation: 886938
An option would be to convert the 'Time' to yearqtr
class with as.yearqtr
from zoo
and do a summarise_all
library(zoo)
library(dplyr)
df1 %>%
group_by(Time = format(as.yearqtr(Time, "%b-%Y"), "%YQ%q")) %>%
summarise_all(mean)
# A tibble: 3 x 5
# Time A B C D
# <chr> <dbl> <dbl> <dbl> <dbl>
#1 2004Q1 50 70.7 54 25.3
#2 2004Q2 79.3 96.3 85.3 18
#3 2004Q3 62 100 79 22
df1 <- structure(list(Time = c("Jan-2004", "Feb-2004", "Mar-2004", "Apr-2004",
"May-2004", "Jun-2004", "Jul-2004"), A = c(42L, 40L, 68L, 97L,
84L, 57L, 62L), B = c(57L, 78L, 77L, 96L, 93L, 100L, 100L), C = c(53L,
56L, 53L, 80L, 76L, 100L, 79L), D = c(28L, 28L, 20L, 16L, 17L,
21L, 22L)), class = "data.frame", row.names = c(NA, -7L))
Upvotes: 1