Reputation: 655
my dataframe looks someway like this
df <- read.table(text="
id start end
1 2 2018-10-01 2018-12-01
2 3 2018-01-01 2018-04-01
", header=TRUE)
What I trying to achieve is get difference between start and day in months for each id and then generate new dataframe with each month for particular id. Result should be
result <- read.table(text="
id date
1 2 2018-10-01
2 2 2018-11-01
3 2 2018-12-01
4 3 2018-01-01
5 3 2018-02-01
6 3 2018-03-01
7 3 2018-04-01
", header=TRUE)
Upvotes: 3
Views: 1591
Reputation: 886948
We can do this easily with Map
. Pass the Date
converted 'start' and 'end' columnd of the dataset as arguments to Map
, get the sequence of 'month's as a list
and expand the 'id' based on the lengths
of list
as well as concatenate the list
elements to create the expanded dataframe
lst1 <- Map(seq, MoreArgs = list(by = 'month'), as.Date(df$start), as.Date(df$end))
data.frame(id = rep(df$id, lengths(lst1)), date = do.call(c, lst1))
# id date
#1 2 2018-10-01
#2 2 2018-11-01
#3 2 2018-12-01
#4 3 2018-01-01
#5 3 2018-02-01
#6 3 2018-03-01
#7 3 2018-04-01
Or using tidyverse
, we mutate
the class
of the 'start', 'end' columns to Date
, using map2
(from purrr
), get the seq
uence of dates from 'start' to 'end' by
the 'month' and expand the data by unnest
ing the dataset
library(tidyverse)
df %>%
mutate_at(2:3, as.Date) %>%
transmute(id = id, date = map2(start, end, ~ seq(.x, .y, by = 'month'))) %>%
unnest
# id date
#1 2 2018-10-01
#2 2 2018-11-01
#3 2 2018-12-01
#4 3 2018-01-01
#5 3 2018-02-01
#6 3 2018-03-01
#7 3 2018-04-01
Upvotes: 3
Reputation: 388817
Most straightforward way using base R functions is to create a seq
uence of monthly dates for each row, create a dataframe and rbind
them together
do.call(rbind, with(df,lapply(1:nrow(df), function(i)
data.frame(id = id[i], date = seq(as.Date(start[i]), as.Date(end[i]), by = "month")))))
# id date
#1 2 2018-10-01
#2 2 2018-11-01
#3 2 2018-12-01
#4 3 2018-01-01
#5 3 2018-02-01
#6 3 2018-03-01
#7 3 2018-04-01
Upvotes: 3