Reputation: 1947
I have a little problem with using aggregate()
function to convert monthly data to quarterly data.
I searched a lot in this topic and there are a lot of examples how to use aggregate()
function combined with ts()
function. But is there any possibility that it would work for simple data frame? I mean let's take the code following :
date<-(seq(as.Date('2000-01-01'),as.Date('2020-04-01'),by = '1 month'))
x<-rnorm(244)
df1<-data.frame(date,x)
aggregate(df1,nfrequency=4,FUN=mean)
Error in aggregate.data.frame(df1, nfrequency = 4, FUN = mean) : argument "by" is missing, with no default
Can you help me with that?
UPDATE
I used the code proposed by Ronak.
The problem I'm dealing now is the following :
set.seed(20)
date<-(seq(as.Date('2000-01-01'),as.Date('2020-04-01'),by = '1 month'))
x<-rnorm(244)
df1<-data.frame(date,x)
unlist((df1 %>% group_by(quarter = quarters(date), year = lubridate::year(date)) %>% summarise(x = mean(x)) %>% arrange(year))[,3])[1]
0.7874086
mean(x[1],x[2],x[3])
1.162685
The means differ. Do you have any idea why?
Upvotes: 1
Views: 4734
Reputation: 1
In my case quarters
function arrange my dataset in weird way. So I make another solution, maybe it would be helpful
df <- df %>%
mutate(quarter = paste(quarters(Date), year(Date)), year = year(Date)) %>%
group_by(quarter, year, .add = T) %>%
summarise(x = mean(x) %>%
arrange(year)
Thanks to this dataset is arranged by year and quarter so it goes ike Q1 2010, Q2 2010, Q3 2010 and so on
Upvotes: 0
Reputation: 388817
You can extract quarter and year information from Date
and aggregate
:
df2 <- aggregate(x~quarter + year, transform(df1,
quarter = quarters(date), year = as.integer(format(date, '%Y'))), mean)
df2 <- df2[order(df2$year), ]
Or using dplyr
:
library(dplyr)
df1 %>%
group_by(quarter = paste(quarters(date), lubridate::year(date))) %>%
summarise(x = mean(x))
# quarter x
# <chr> <dbl>
# 1 Q1 2000 0.347
# 2 Q1 2001 -0.592
# 3 Q1 2002 0.802
# 4 Q1 2003 0.237
# 5 Q1 2004 -0.00882
# 6 Q1 2005 0.0535
# 7 Q1 2006 0.218
# 8 Q1 2007 0.177
# 9 Q1 2008 -0.258
#10 Q1 2009 0.246
# … with 72 more rows
Upvotes: 4
Reputation: 1304
assuming you want to aggregate the years it would look something like this:
date<-(seq(as.Date('2000-01-01'),as.Date('2020-04-01'),by = '1 month'))
x<-rnorm(244)
df1<-data.frame(date,x)
head(df1)
y<-aggregate(list(mean_x=df1$x), #gives a name to the aggregated column
by=list(y=substring(df1$date, 1,4)), # "by" has to be a list argument including the column you want to use as aggregating variable (here aggregating over the year)
FUN=mean)
y
Upvotes: 0