Reputation: 395
I have data with a date column:
library(data.table)
library(lubridate)
Y = data.table(X = c("2012-12-31", "2021-10-31"))
X
1: 2012-12-31
2: 2021-10-31
I want to add a sequence of months, from 1 to x
, to the dates in column "X". The result should be a set of new columns, "X_1", "X_2", ... "X_x":
X X_1 X_2 X_x
1: 2012-12-31 2013-01-31 2013-02-28 .. (2012-12-31) %m+% months(x)
2: 2021-10-31 2021-11-30 2021-12-31 .. (2021-10-31) %m+% months(x)
I've tried several things that didn't work, such as:
Y[ , c(paste0("X_", 1:15))] = AddMonths(Y$X, c(1:15)) %>% LastDayInMonth()
Y[ , c(paste0("X_", 1:15)) := AddMonths(as.IDate(X), c(1:15)) %>% LastDayInMonth()]
I want to create 15 columns based on this operation, so I want to avoid for
loops. My real data have 80K rows.
Upvotes: 2
Views: 722
Reputation: 67778
Both %m+%
and months
are vectorized.
library(data.table)
library(lubridate)
n = 3
nm = paste0("x", "_", seq(n))
m = rep(seq(n), each = nrow(d))
d[ , (nm) := split(x %m+% months(m), m)]
d
# x x_1 x_2 x_3
# 1: 2012-12-31 2013-01-31 2013-02-28 2013-03-31
# 2: 2021-10-31 2021-11-30 2021-12-31 2022-01-31
Explanation:
Set the maximum number of months in the sequence to be added to each value in the date column (e.g. n = 3
). Create a vector of column names to which the result should be assigned (nm = paste0("x", "_", seq(n))
). Create a vector of months to be added (m = rep(seq(n), each = nrow(d))
) to the date column. Add the months in 'm' to the dates in 'x' (x %m+% months(m)
). Split the result by the number of months added (split(x %m+% months(m), m)
) to a list. Use the fundamental property of data.table
, "each element of the list becomes a column in the resulting data.table
", and assign the resulting columns by reference (:=
).
d = data.table(x = as.Date(c("2012-12-31","2021-10-31")))
Upvotes: 2
Reputation: 78917
Update: I pasted a wrong version of the answer: months(1)
should be months(i-1)
. Corrected now.
In a for loop we can do it like this:
library(lubridate) #%m+%
Y=data.table("X"=as.Date(c("2012-12-31","2021-10-31")))
for (i in 2:15) {
Y <- Y %>%
mutate(col= X %m+% months(i-1))
colnames(Y)[i] <- paste0("X_",i-1)
}
X X_1 X_2 X_3 X_4 X_5 X_6 X_7 X_8 X_9 X_10 X_11 X_12
1: 2012-12-31 2013-01-31 2013-02-28 2013-03-31 2013-04-30 2013-05-31 2013-06-30 2013-07-31 2013-08-31 2013-09-30 2013-10-31 2013-11-30 2013-12-31
2: 2021-10-31 2021-11-30 2021-12-31 2022-01-31 2022-02-28 2022-03-31 2022-04-30 2022-05-31 2022-06-30 2022-07-31 2022-08-31 2022-09-30 2022-10-31
X_13 X_14
1: 2014-01-31 2014-02-28
2: 2022-11-30 2022-12-31
Upvotes: 0
Reputation: 3447
Please check this code
data.table("X"=c("2012-12-31","2021-10-31")) %>% mutate(date=as.Date(X), seq=row_number(), date2=date %m+% months(15)) %>%
group_by(seq) %>%
pivot_longer(starts_with('date'), names_to = 'name', values_to = 'date') %>%
tidyr::complete(date=full_seq(date, period = 1)) %>% fill(X) %>%
mutate(year=year(date), month=month(date)) %>% group_by(X,year, month) %>%
slice_tail(n=1) %>% ungroup %>% group_by(X) %>%
mutate(row=row_number()) %>%
pivot_wider(X,names_from = row, names_prefix = 'X',values_from = date)
Created on 2023-01-28 with reprex v2.0.2
# A tibble: 2 × 17
# Groups: X [2]
X X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11
<chr> <date> <date> <date> <date> <date> <date> <date> <date> <date> <date> <date>
1 2012-12-31 2012-12-31 2013-01-31 2013-02-28 2013-03-31 2013-04-30 2013-05-31 2013-06-30 2013-07-31 2013-08-31 2013-09-30 2013-10-31
2 2021-10-31 2021-10-31 2021-11-30 2021-12-31 2022-01-31 2022-02-28 2022-03-31 2022-04-30 2022-05-31 2022-06-30 2022-07-31 2022-08-31
# … with 5 more variables: X12 <date>, X13 <date>, X14 <date>, X15 <date>, X16 <date>
# ℹ Use `colnames()` to see all variable names
Upvotes: 0
Reputation: 4524
As the question is tagged with dplyr
I assume tidyverse
is allowed, too.
Starting with a list for the 2 dates you have.
A function taking a date
as input adds from starting month as many months as choosen in the additional argument.
The function is then applied with map
to the list of initial dates. This is then transformed into a data.frame
.
n=6
library(tidyverse)
library(lubridate)
f <- function(x, n = 5) {
yy <- x
for (i in seq(n)) {
yy <- append(yy, x %m+% months(i))
}
return(yy)
}
l <- list(("2012-12-31"), ("2021-10-31"))
l |>
map(lubridate::as_date) |>
map(n = 6, f) |>
as.data.frame() |>
setNames(l) |>
t() |> as.data.frame()
#> V1 V2 V3 V4 V5 V6
#> 2012-12-31 2012-12-31 2013-01-31 2013-02-28 2013-03-31 2013-04-30 2013-05-31
#> 2021-10-31 2021-10-31 2021-11-30 2021-12-31 2022-01-31 2022-02-28 2022-03-31
#> V7
#> 2012-12-31 2013-06-30
#> 2021-10-31 2022-04-30
Upvotes: 0