I have a data.frame df that has monthly data:
Date Value
2008-01-01 3.5
2008-02-01 9.5
2008-03-01 0.1
I want a row for every day of each month (assuming Value does not change within a month), since I will be merging this into a different table that has daily data.
I want the output to look like this:
Date Value
2008-01-02 3.5
2008-01-03 3.5
2008-01-04 3.5
2008-01-05 3.5
2008-01-06 3.5
2008-01-07 3.5
2008-01-08 3.5
2008-01-09 3.5
2008-01-10 3.5
2008-01-11 3.5
2008-01-12 3.5
2008-01-13 3.5
2008-01-14 3.5
2008-01-15 3.5
2008-01-16 3.5
2008-01-17 3.5
2008-01-18 3.5
2008-01-19 3.5
2008-01-20 3.5
2008-01-21 3.5
2008-01-22 3.5
2008-01-23 3.5
2008-01-24 3.5
2008-01-25 3.5
2008-01-26 3.5
2008-01-27 3.5
2008-01-28 3.5
2008-01-29 3.5
2008-01-30 3.5
2008-01-31 3.5
2008-02-01 9.5
I have tried to.daily, but my call:
df <- to.daily(df$Date)
returns
Error in to.period(x, "days", name = name, ...) : ‘x’ contains no data
Upvotes: 6
to.daily can only be applied to xts/zoo objects, and it can only convert to a LOWER frequency, i.e. from daily to monthly, but not the other way round.
One easy way to accomplish what you want is to convert df to an xts object:
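library(xts)
# order.by must be a date/time class, so convert Date if it was read as character
df.xts <- xts(df$Value, order.by = as.Date(df$Date))
Then merge it with an all-NA daily series and carry the last value forward, like so: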
na.locf(merge(df.xts, foo=zoo(NA, order.by=seq(start(df.xts), end(df.xts),
"day",drop=F)))[, 1])
df.xts
2008-01-01 3.5
2008-01-02 3.5
2008-01-03 3.5
2008-01-04 3.5
2008-01-05 3.5
2008-01-06 3.5
2008-01-07 3.5
….
2008-01-27 3.5
2008-01-28 3.5
2008-01-29 3.5
2008-01-30 3.5
2008-01-31 3.5
2008-02-01 9.5
2008-02-02 9.5
2008-02-03 9.5
2008-02-04 9.5
2008-02-05 9.5
2008-02-06 9.5
2008-02-07 9.5
2008-02-08 9.5
….
2008-02-27 9.5
2008-02-28 9.5
2008-02-29 9.5
2008-03-01 0.1
If you want the value to change smoothly over the course of a month instead of staying flat, use na.spline in place of na.locf.
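As a rough illustration of the difference between the two fills, the sketch below rebuilds the question's monthly series directly as a zoo object and pads it to daily frequency with the zero-width zoo(, days) idiom. The names monthly, days, daily, step and smooth are made up for this example, and this is only a minimal sketch, not necessarily how the output above was produced.

```r
library(zoo)

# The three monthly observations from the question
monthly <- zoo(c(3.5, 9.5, 0.1),
               order.by = as.Date(c("2008-01-01", "2008-02-01", "2008-03-01")))

# Every calendar day between the first and last observation
days <- seq(start(monthly), end(monthly), by = "day")

# Merging with a zero-width series adds the missing days as NA
daily <- merge(monthly, zoo(, days))

step   <- na.locf(daily)    # carry each monthly value forward (flat within a month)
smooth <- na.spline(daily)  # cubic spline through the monthly points
```

With step, every day keeps the value of the first of its month; with smooth, only the first of each month is guaranteed to match the original data.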
Upvotes: 1
One option is tidyr::expand, which can expand each row into a sequence running from the first to the last day of its month. lubridate::floor_date gives the first day of the month, and lubridate::ceiling_date() - days(1) gives the last day of the month.
library(tidyverse)
library(lubridate)
df %>% mutate(Date = ymd(Date)) %>%
group_by(Date) %>%
expand(Date = seq(floor_date(Date, unit = "month"),
ceiling_date(Date, unit="month")-days(1), by="day"), Value) %>%
as.data.frame()
# Date Value
# 1 2008-01-01 3.5
# 2 2008-01-02 3.5
# 3 2008-01-03 3.5
# 4 2008-01-04 3.5
# 5 2008-01-05 3.5
#.....so on
# 32 2008-02-01 9.5
# 33 2008-02-02 9.5
# 34 2008-02-03 9.5
# 35 2008-02-04 9.5
# 36 2008-02-05 9.5
#.....so on
# 85 2008-03-25 0.1
# 86 2008-03-26 0.1
# 87 2008-03-27 0.1
# 88 2008-03-28 0.1
# 89 2008-03-29 0.1
# 90 2008-03-30 0.1
# 91 2008-03-31 0.1
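The two month boundaries used above can also be checked in isolation. This is a small sketch with made-up variable names (d, first_day, last_day, month_days), and it assumes a reasonably recent lubridate, where ceiling_date() moves a Date that already sits on a month boundary up to the start of the next month.

```r
library(lubridate)

d <- ymd("2008-02-01")  # a first-of-month date, as in the question

first_day <- floor_date(d, unit = "month")
# ceiling_date() gives the first day of the NEXT month; step back one day
last_day  <- ceiling_date(d, unit = "month") - days(1)

# All days of that month
month_days <- seq(first_day, last_day, by = "day")
```

Because 2008 is a leap year, month_days here runs through 29 February.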
Data:
df <- read.table(text =
"Date Value
2008-01-01 3.5
2008-02-01 9.5
2008-03-01 0.1",
header = TRUE, stringsAsFactors = FALSE)
Upvotes: 1
Another way:
library(lubridate)
d <- read.table(text = "Date Value
2008-01-01 3.5
2008-02-01 9.5
2008-03-01 0.1",
stringsAsFactors = FALSE, header = TRUE)
Dates <- seq(from = min(as.Date(d$Date)),
to = ceiling_date(max(as.Date(d$Date)), "month") - days(1),
by = "1 days")
data.frame(Date = Dates,
Value = setNames(d$Value, d$Date)[format(Dates, format = "%Y-%m-01")])
Upvotes: 0
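This may not be the most efficient approach, but in base R we can do:
# seq() needs Date objects, so convert the column if it was read as character
df$Date <- as.Date(df$Date)
do.call("rbind", lapply(seq_len(nrow(df)), function(i)
  data.frame(Date = seq(df$Date[i],
                        (seq(df$Date[i], length = 2, by = "months") - 1)[2], by = "1 days"),
             Value = df$Value[i])))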
For each row, we generate a sequence of dates from that row's date to the last day of the same month, which is calculated by
(seq(df$Date[i],length=2,by="months") - 1)[2]
then repeat the row's value for all of those dates and put them in a data frame.
This gives a list of data frames, which we then combine with rbind using do.call.
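The last-day-of-month step can be tried on its own. This is a minimal base R sketch; start, bounds, last_day and n_days are names chosen only for this illustration.

```r
start <- as.Date("2008-02-01")

# Two dates one month apart: the start and the first day of the next month
bounds <- seq(start, length.out = 2, by = "month")

# Subtract one day and keep the second element: the last day of start's month
last_day <- (bounds - 1)[2]

# Number of daily rows this month will contribute
n_days <- as.integer(last_day - start) + 1L
```

Stepping by a calendar month and then back by one day avoids hard-coding month lengths, so leap years are handled automatically.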
Upvotes: 0
I'm not sure I understood the question perfectly, but I think something like this may work.
First, I define the monthly data table:
library(data.table)
DT_month=data.table(Date=as.Date(c("2008-01-01","2008-02-01","2008-03-01","2008-05-01","2008-07-01"))
,Value=c(3.5,9.5,0.1,5,8))
Then, you have to do the following
DT_month[,Month:=month(Date)]
DT_month[,Year:=year(Date)]
start_date=min(DT_month$Date)
end_date=max(DT_month$Date)
DT_daily=data.table(Date=seq.Date(start_date,end_date,by="day"))
DT_daily[,Month:=month(Date)]
DT_daily[,Year:=year(Date)]
DT_daily[,Value:=-100]
for (i in unique(DT_daily$Year)) {
  for (j in unique(DT_daily$Month)) {
    if (length(DT_month[Year == i & Month == j, Value]) != 0) {
      DT_daily[Year == i & Month == j, Value := DT_month[Year == i & Month == j, Value]]
    }
  }
}
Basically, the code stores the month and year of each monthly value in separate columns.
Then it creates a table of daily dates spanning the minimum and maximum dates in your monthly data, and adds the same year and month columns to the daily data.
Finally, it goes through every combination of year and month and fills the daily values with the monthly ones. If there is no data for a given combination of month and year, the value stays at the placeholder -100.
Please let me know if it works.
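As a possible alternative to the nested loop, the same fill can be sketched as a data.table update join on a shared year-month key. This swaps in a different technique from the loop above: the YearMonth column is a hypothetical helper introduced only for this example, the sample data is a shortened version of the table above, and months without data end up as NA rather than -100.

```r
library(data.table)

# Monthly data with a gap (no April), loosely following the example above
DT_month <- data.table(
  Date  = as.Date(c("2008-01-01", "2008-02-01", "2008-03-01", "2008-05-01")),
  Value = c(3.5, 9.5, 0.1, 5)
)

# One row per day between the first and last monthly date
DT_daily <- data.table(Date = seq(min(DT_month$Date), max(DT_month$Date), by = "day"))

# Hypothetical helper column: a "YYYY-MM" key shared by both tables
DT_month[, YearMonth := format(Date, "%Y-%m")]
DT_daily[, YearMonth := format(Date, "%Y-%m")]

# Update join: copy Value from the matching month; unmatched days stay NA
DT_daily[DT_month, on = "YearMonth", Value := i.Value]
DT_daily[, YearMonth := NULL]
```

Like the loop, this only covers days up to the last monthly date, so the final month contributes a single row unless the end date is extended.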
Upvotes: 1