Reputation: 75
My dat looks like below with different node_desc having weekly data for 4 years
ID1 ID2 DATE_ value
1: 00001 436 2014-06-29 175.8164
2: 00001 436 2014-07-06 188.9264
3: 00001 436 2014-07-13 167.5376
4: 00001 436 2014-07-20 160.7907
5: 00001 436 2014-07-27 185.3018
6: 00001 436 2014-08-03 179.5748
would like to convert data frame to monthly.Trying below code
df %>%
tq_transmute(select = c(value,ID1),
mutate_fun = apply.monthly,
FUN = mean)
But my output looks like below
DATE_ value
<dttm> <dbl>
1 2014-06-29 00:00:00 144.
2 2014-07-27 00:00:00 143.
3 2014-08-31 00:00:00 143.
4 2014-09-28 00:00:00 152.
5 2014-10-26 00:00:00 156.
6 2014-11-30 00:00:00 166.
But I would like to have ID1,ID2,Date(monthly) and value(either getting the mean or max of 4 weeks) instead of just having date and value,because I have data of different ID1's for 4 years.Can someone help me in R
Upvotes: 0
Views: 1340
Reputation: 23608
The problem you have is that your dataset doesn't have daily data. The apply.monthly
function comes from xts, but tidyquant uses wrappers around a lot of functions so they work in a more tidy way. apply.monthly
needs an xts object, which is basicly a matrix with a time index.
Also know that apply.monthly returns the last available day of the month in your timeseries. Looking at your example set, the last day it returns for july 2017 will the 27th. Now if you have 5 records (weeks) in a month the mean function will do this over 5 records. It will never be exactly 1 month as weekly data never covers monthly data.
But with tidyquant you can get sort of a monthly result with ID1 and ID2 with your data if you join the outcome with the original data. See code below. I haven't removed any unwanted columns.
df1 %>%
tq_transmute(select = c(value, ID1),
mutate_fun = apply.monthly,
FUN = mean) %>%
mutate(DATE_ = as.Date(DATE_)) %>%
inner_join(df1, by = "DATE_")
# A tibble: 3 x 5
DATE_ value.x ID1 ID2 value.y
<date> <dbl> <fct> <fct> <dbl>
1 2014-06-29 176. 00001 436 176.
2 2014-07-27 176. 00001 436 185.
3 2014-08-03 180. 00001 436 180.
data:
df1 <- data.frame(ID1 = rep("00001", 6),
ID2 = rep("436", 6),
DATE_ = as.Date(c("2014-06-29", "2014-07-06", "2014-07-13", "2014-07-20", "2014-07-27", "2014-08-03")),
value = c(175.8164,188.9264,167.5376,160.7907,185.3018,179.5748)
)
Upvotes: 0
Reputation: 171
Here's my take
dta <- data.frame(id1=rep("00001",6),id2=rep("436",6),
date_=as.Date(c("29jun2014","6jul2014","13jul2014","20jul2014","27jul2014","3aug2014"),"%d%B%Y"),
value=c(175.8164,188.9264,167.5376,160.7907,185.3018,179.5748))
And dplyr
would do the rest. Here I summarize the data by taking the mean value
library(dplyr)
my_dta <- dta %>% mutate(month_=format(as.yearmon(date_),"%b"))
my_dta %>% group_by(.dots=c("id1","id2")) %>% summarise(mvalue=mean(value))
Upvotes: 1