Reputation: 2101
I have a quite complex problem I am not able to tackle.
I have a dataframe I read in dplyr:
id date type
9373 2019-09-29 6-months
9945 2019-08-15 3-months
9945 2019-11-13 3-months
9615 2019-12-28 3-months
11465 2019-07-13 3-months
11465 2019-10-11 3-months
reproducible example:
library(tidyverse)
df <- data.frame(stringsAsFactors = FALSE,
  id = c(9373, 9945, 9945, 9615, 11465, 11465),
  date = c("2019-09-29", "2019-08-15", "2019-11-13", "2019-12-28",
           "2019-07-13", "2019-10-11"),
  type = c("6-months", "3-months", "3-months", "3-months", "3-months",
           "3-months")) %>%
  mutate(date = as.Date(date))
Each id is a transaction that happened on a given date; each transaction is repeated over either 3 months or 6 months, as specified in type.
I want to expand these transactions into their monthly counterparts up to the current date. For example, the first transaction, 9373, has type == 6-months, so it should be repeated 6 times on a 30-day cycle starting from 2019-09-29, but only up to the current day. Today is 2020-01-07, so it yields just 4 monthly transactions, since the last two have not happened yet.
Same for the 3-months transactions, always considering the starting date and the current date.
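To make the counting concrete, here is a minimal base R sketch of the arithmetic for transaction 9373 alone. The variable names (start, today, cycle, kept) are only illustrative, and today is fixed to the date mentioned above rather than taken from Sys.Date():

```r
# Illustrative names only; the cutoff is hard-coded to the "today" of this post.
start <- as.Date("2019-09-29")
today <- as.Date("2020-01-07")

# Six dates on a 30-day cycle, beginning at the transaction date
cycle <- seq(start, by = "30 days", length.out = 6)

# Keep only the dates that have already happened
kept <- cycle[cycle <= today]
format(kept)
```

The two dates dropped here are 2020-01-27 and 2020-02-26, which is why only 4 rows remain for this id.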
Example of the final result:
id date type
9373 2019-09-29 6-months # first 6-months cycle transaction
9373 2019-10-29 6-months
9373 2019-11-28 6-months
9373 2019-12-28 6-months
9945 2019-08-15 3-months #
9945 2019-09-14 3-months
9945 2019-10-14 3-months
9945 2019-11-13 3-months #
9945 2019-12-13 3-months
9615 2019-12-28 3-months #
Any help is highly appreciated!
Upvotes: 2
Views: 42
Reputation: 2017
You can use rowwise() and do() like so:
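df %>%
  rowwise() %>%
  do({
    # Number of extra 30-day steps after the first date (e.g. "6-months" gives 5)
    p <- as.numeric(gsub('\\D+', '', .$type)) - 1
    tibble(
      id = .$id,
      # Step by 30 days, stopping at the end of the cycle or today, whichever is earlier
      date = seq(.$date, pmin(Sys.Date(), .$date + p * 30), 30),
      type = .$type
    )
  }) %>%
  ungroup()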
# A tibble: 16 x 3
# id date type
# * <dbl> <date> <chr>
# 1 9373 2019-09-29 6-months
# 2 9373 2019-10-29 6-months
# 3 9373 2019-11-28 6-months
# 4 9373 2019-12-28 6-months
# 5 9945 2019-08-15 3-months
# 6 9945 2019-09-14 3-months
# 7 9945 2019-10-14 3-months
# 8 9945 2019-11-13 3-months
# 9 9945 2019-12-13 3-months
# 10 9615 2019-12-28 3-months
# 11 11465 2019-07-13 3-months
# 12 11465 2019-08-12 3-months
# 13 11465 2019-09-11 3-months
# 14 11465 2019-10-11 3-months
# 15 11465 2019-11-10 3-months
# 16 11465 2019-12-10 3-months
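Note that the output above depends on Sys.Date(), so it only reproduces when run on 2020-01-07 (or until the next 30-day step falls due). As a rough sketch, assuming you want a reproducible result, the same expansion can be written in base R with a fixed cutoff. The names today and expand_row() are hypothetical helpers introduced here, not part of dplyr:

```r
# Sketch only: `today` and `expand_row()` are illustrative names.
df <- data.frame(
  id   = c(9373, 9945, 9945, 9615, 11465, 11465),
  date = as.Date(c("2019-09-29", "2019-08-15", "2019-11-13",
                   "2019-12-28", "2019-07-13", "2019-10-11")),
  type = c("6-months", "3-months", "3-months",
           "3-months", "3-months", "3-months"),
  stringsAsFactors = FALSE
)

today <- as.Date("2020-01-07")  # fixed cutoff taken from the question

expand_row <- function(i) {
  # Number of monthly repetitions encoded in the type string
  n     <- as.numeric(gsub("\\D+", "", df$type[i]))
  # Full 30-day cycle for this row, then drop dates after the cutoff
  dates <- seq(df$date[i], by = "30 days", length.out = n)
  dates <- dates[dates <= today]
  data.frame(id   = rep(df$id[i], length(dates)),
             date = dates,
             type = rep(df$type[i], length(dates)),
             stringsAsFactors = FALSE)
}

# Expand every row and stack the pieces
result <- do.call(rbind, lapply(seq_len(nrow(df)), expand_row))
nrow(result)
```

Indexing df$date[i] inside the helper keeps the Date class intact, which a plain for loop over the column would not.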
Upvotes: 1
Reputation: 388907
Here is one way using dplyr and tidyr functions.
library(dplyr)
library(tidyr)
df %>%
  # Extract the number from the type column
  mutate(num = readr::parse_number(type)) %>%
  # For each transaction
  group_by(row = row_number()) %>%
  # Create a sequence of num dates starting at date, in steps of 30 days
  complete(id, type, date = seq(date, by = "30 days", length.out = num)) %>%
  # Remove rows whose date is later than today
  filter(date <= Sys.Date()) %>%
  ungroup() %>%
  select(-num, -row)
# A tibble: 16 x 3
# id type date
# <dbl> <chr> <date>
# 1 9373 6-months 2019-09-29
# 2 9373 6-months 2019-10-29
# 3 9373 6-months 2019-11-28
# 4 9373 6-months 2019-12-28
# 5 9945 3-months 2019-08-15
# 6 9945 3-months 2019-09-14
# 7 9945 3-months 2019-10-14
# 8 9945 3-months 2019-11-13
# 9 9945 3-months 2019-12-13
#10 9615 3-months 2019-12-28
#11 11465 3-months 2019-07-13
#12 11465 3-months 2019-08-12
#13 11465 3-months 2019-09-11
#14 11465 3-months 2019-10-11
#15 11465 3-months 2019-11-10
#16 11465 3-months 2019-12-10
Upvotes: 1