Reputation: 21
I have a starting time specified as a year-month character, e.g. "2020-12". From the start, for each of T consecutive months, I need to generate n different dates (year-month-day), where the day is random.
Any help will be useful!
The data I'm working on:
data <- data.frame(
data = sample(seq(as.Date('2000/01/01'), as.Date('2020/01/01'), by="day"), 500),
price = round(runif(500, min = 10, max = 20),2),
quantity = round(rnorm(500,30),0)
)
Upvotes: 2
Views: 182
Reputation: 67778
You could convert the start time to a class for monthly data, zoo::yearmon. Then use as.Date.yearmon and its frac argument ("a number between 0 and 1 inclusive that indicates the fraction of the way through the period that the result represents") with random values from runif (uniform between 0 and 1) to convert to a random date within each year-month.
start = "2020-12"
T = 3
n = 2
library(zoo)
set.seed(1)
as.Date(as.yearmon(start) + rep((1:T)/12, each = n), frac = runif(T * n))
# [1] "2021-01-08" "2021-01-12" "2021-02-16" "2021-02-25" "2021-03-07" "2021-03-27"
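The call above adds (1:T)/12 to the start, so its first month is the one after "2020-12". If the start month itself should be included, the offsets can begin at 0 instead. A minimal sketch of that variant, assuming zoo is installed; n_months is just an illustrative rename of T (which otherwise masks R's shorthand for TRUE), and the data frame layout is an assumption about what is wanted downstream:

```r
library(zoo)

start <- "2020-12"
n_months <- 3   # plays the role of T in the answer above
n <- 2

# Offsets 0, 1, ..., n_months - 1 (in twelfths of a year), each repeated n times,
# so that the start month is the first month generated.
ym <- as.yearmon(start) + rep((seq_len(n_months) - 1) / 12, each = n)

set.seed(1)
# One random fraction per requested date; frac = 0 is the first day of the month.
dates <- as.Date(ym, frac = runif(length(ym)))

# Collect the result in a data frame, one row per generated date.
result <- data.frame(month = format(dates, "%Y-%m"), date = dates)
result
```

Because each fraction is drawn independently, two dates in the same month can coincide; if the n dates must be distinct, sampling day numbers without replacement (as in the next answer) avoids that.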
Upvotes: 2
Reputation: 160407
func <- function(start, months, n) {
  # First day of each of the `months` consecutive months, beginning with `start`.
  month_starts <- seq(as.Date(paste0(start, "-01")), by = "month", length.out = months)
  # Day 0 of the following month is the last day of this month, so "%d" gives the month length.
  month_ends <- as.POSIXlt(month_starts)
  month_ends$mon <- month_ends$mon + 1
  month_ends$mday <- month_ends$mday - 1
  days_per_month <- as.integer(format(month_ends, format = "%d"))
  # Sample n distinct day numbers per month; n must not exceed the shortest month's length.
  days <- lapply(days_per_month, sample, size = n)
  # Subtract 1 so day k maps to the k-th of the month; adding k to the 1st would spill into the next month.
  dates <- Map(function(first, d) first + d - 1, month_starts, days)
  do.call(c, dates)
}
set.seed(2021)
func("2020-12", 4, 3)
# [1] "2020-12-07" "2020-12-06" "2020-12-14" "2021-01-26" "2021-01-07" "2021-01-12" "2021-02-20" "2021-02-06" "2021-02-27"
# [10] "2021-03-27" "2021-03-06" "2021-03-14"
func("2020-12", 5, 2)
# [1] "2020-12-05" "2020-12-15" "2021-01-07" "2021-01-09" "2021-02-23" "2021-02-12" "2021-03-19" "2021-03-28" "2021-04-18"
# [10] "2021-04-27"
func("2020-12", 2, 10)
# [1] "2020-12-28" "2020-12-29" "2020-12-03" "2020-12-14" "2020-12-08" "2020-12-26" "2020-12-04" "2020-12-05" "2020-12-22"
# [10] "2020-12-16" "2021-01-02" "2021-01-19" "2021-01-04" "2021-01-21" "2021-01-22" "2021-01-05" "2021-01-09" "2021-01-06"
# [19] "2021-01-18" "2021-01-11"
Most of the dancing with POSIXlt objects is because it gives us clean (base R) access to the number of days in a month, which makes sampling the days in a month rather simple. It can also be done (code-golf shorter) using the lubridate package, but I don't know that that is any more correct than this code is.
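For comparison, here is a rough sketch of what a lubridate version might look like. It is not the answerer's code: the helper name random_dates_lubridate and the argument n_months are made up for illustration, and it assumes lubridate is installed. It relies only on ymd() to parse the start and days_in_month() to get each month's length.

```r
library(lubridate)

# Hypothetical helper; arguments mirror func() above.
random_dates_lubridate <- function(start, n_months, n) {
  # First day of each consecutive month, beginning with the start month.
  first_days <- seq(ymd(paste0(start, "-01")), by = "month", length.out = n_months)
  # Number of days in each of those months (names dropped for clarity).
  n_days <- unname(days_in_month(first_days))
  # n distinct zero-based offsets per month, so every date stays inside its own month.
  offsets <- lapply(n_days, function(d) sample(d, size = n) - 1L)
  rep(first_days, each = n) + unlist(offsets)
}

set.seed(2021)
x <- random_dates_lubridate("2020-12", 3, 4)
x
```

The offsets run from 0 to the month length minus 1, which is what keeps the first of the month reachable and the first of the following month unreachable.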
This just dumps out a sequence of random dates, with n days per month. It does not sort within each month, though it does output the months in order. (That's not a difficult extension; there just wasn't a requirement for it.) It doesn't put out a frame, but you can easily extend this to fit in a frame or call data.frame(date = do.call(c, dates)) on the last line, depending on what you need to do with the output.
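The two extensions mentioned here (sorting within each month and returning a frame) could be sketched in base R as follows. The function name random_dates_frame and the argument n_months are hypothetical, and the month lengths are computed by differencing consecutive month starts rather than via POSIXlt, which is a swapped-in shortcut and not the approach used in func.

```r
# Hypothetical wrapper; base R only.
random_dates_frame <- function(start, n_months, n) {
  # One extra month start so that consecutive differences give each month's length.
  starts <- seq(as.Date(paste0(start, "-01")), by = "month", length.out = n_months + 1)
  first_days <- starts[-length(starts)]
  n_days <- as.integer(diff(starts))
  # n distinct day numbers per month, sorted, shifted to zero-based offsets.
  offsets <- lapply(n_days, function(d) sort(sample(d, size = n)) - 1L)
  data.frame(date = rep(first_days, each = n) + unlist(offsets))
}

set.seed(2021)
df <- random_dates_frame("2020-12", 4, 3)
df
```

Since the months are already generated in order and the days are sorted within each month, the date column comes out sorted overall.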
Upvotes: 2