Reputation: 63
I want to add a dataframe column of integers to dataframe columns of dates in the same row, putting the result into new columns. Then count days from start to events in the same row, putting those into new columns. Have many date columns and many more than 5 rows in the dataframe, so a loop or something like that might be good.
This is what I have:
havedf <- data.frame(id = c("100","103","104","107","109"),
startDate = c("2015-04-02","2017-01-10","2016-02-11","2012-01-16","2014-03-05"),
eventOneDate = c("2015-04-10","2017-03-13","2016-03-04","2012-03-06","2014-03-11"),
eventTwoDate = c("2015-05-02","2017-08-12","2016-05-09","2012-04-05","2014-06-09"),
extradays = c(12,9,16,5,10))
This is what I want:
wantdf <- data.frame(id = c("100","103","104","107","109"),
startDate = c("2015-04-02","2017-01-10","2016-02-11","2012-01-16","2014-03-05"),
eventOneDate = c("2015-04-10","2017-03-13","2016-03-04","2012-03-06","2014-03-11"),
eventTwoDate = c("2015-05-02","2017-08-12","2016-05-09","2012-04-05","2014-06-09"),
extradays = c(12,9,16,5,10),
startDate_w_extra = c("2015-04-14","2017-01-19","2016-02-27","2012-01-21","2014-03-15"),
eventOneDate_w_extra = c("2015-04-22","2017-03-22","2016-03-20","2012-03-11","2014-03-21"),
eventTwoDate_w_extra = c("2015-05-14","2017-08-21","2016-05-25","2012-04-10","2014-06-19"),
daysToEventOne = c(8,62,22,50,6),
daysToEventTwo = c(30,214,88,80,96))
Upvotes: 1
Views: 202
Reputation: 2949
A simplified version of @TarJae 's answer:
library(lubridate)
library(dplyr)
havedf %>%
mutate(across(contains("Date"), ymd)) %>%
mutate(across(contains("Date"), .fns = list(w_extra = ~ .+extradays)),
daysToEventOne = as.integer(eventOneDate - startDate),
daysToEventTwo = as.integer(eventTwoDate - startDate))
id startDate eventOneDate eventTwoDate extradays startDate_w_extra eventOneDate_w_extra eventTwoDate_w_extra
1 100 2015-04-02 2015-04-10 2015-05-02 12 2015-04-14 2015-04-22 2015-05-14
2 103 2017-01-10 2017-03-13 2017-08-12 9 2017-01-19 2017-03-22 2017-08-21
3 104 2016-02-11 2016-03-04 2016-05-09 16 2016-02-27 2016-03-20 2016-05-25
4 107 2012-01-16 2012-03-06 2012-04-05 5 2012-01-21 2012-03-11 2012-04-10
5 109 2014-03-05 2014-03-11 2014-06-09 10 2014-03-15 2014-03-21 2014-06-19
daysToEventOne daysToEventTwo
1 8 30
2 62 214
3 22 88
4 50 80
5 6 96
If you have more than two events and do not want to select the event date columns, you can use this:
havedf %>%
mutate(across(contains("Date"), ymd)) %>%
mutate(across(contains("Date"), .fns = list(w_extra = ~ .+extradays)),
across(contains("event") & !contains("w_extra"), .fns = list(daysTo = ~ .- startDate)))
Output:
id startDate eventOneDate eventTwoDate extradays startDate_w_extra eventOneDate_w_extra eventTwoDate_w_extra
1 100 2015-04-02 2015-04-10 2015-05-02 12 2015-04-14 2015-04-22 2015-05-14
2 103 2017-01-10 2017-03-13 2017-08-12 9 2017-01-19 2017-03-22 2017-08-21
3 104 2016-02-11 2016-03-04 2016-05-09 16 2016-02-27 2016-03-20 2016-05-25
4 107 2012-01-16 2012-03-06 2012-04-05 5 2012-01-21 2012-03-11 2012-04-10
5 109 2014-03-05 2014-03-11 2014-06-09 10 2014-03-15 2014-03-21 2014-06-19
eventOneDate_daysTo eventTwoDate_daysTo
1 8 days 30 days
2 62 days 214 days
3 22 days 88 days
4 50 days 80 days
5 6 days 96 days
Upvotes: 2
Reputation: 35554
With dplyr
, you can use across()
to simplify repeated column manipulations.
library(dplyr)
havedf %>%
mutate(across(contains("Date"), as.Date),
across(contains("Date"), `+`, extradays, .names = "{.col}_w_extra"),
across(matches("^event.+Date$"), `-`, startDate, .names = "daysTo{.col}"))
# # A tibble: 5 × 10
# id startDate eventOneDate eventTwoDate extradays startDate_w_extra eventOneDate_w_extra eventTwoDate_w_extra daysToeventOneDate daysToeventTwoDate
# <chr> <date> <date> <date> <dbl> <date> <date> <date> <drtn> <drtn>
# 1 100 2015-04-02 2015-04-10 2015-05-02 12 2015-04-14 2015-04-22 2015-05-14 8 days 30 days
# 2 103 2017-01-10 2017-03-13 2017-08-12 9 2017-01-19 2017-03-22 2017-08-21 62 days 214 days
# 3 104 2016-02-11 2016-03-04 2016-05-09 16 2016-02-27 2016-03-20 2016-05-25 22 days 88 days
# 4 107 2012-01-16 2012-03-06 2012-04-05 5 2012-01-21 2012-03-11 2012-04-10 50 days 80 days
# 5 109 2014-03-05 2014-03-11 2014-06-09 10 2014-03-15 2014-03-21 2014-06-19 6 days 96 days
Upvotes: 3
Reputation: 78917
Here is how we can do it:
ymd()
function from lubridate
package.Then you could do simple maths:
library(lubridate)
library(dplyr)
havedf %>%
mutate(across(contains("Date"), ymd)) %>%
mutate(startDate_w_extra = startDate+extradays,
eventOneDate_w_extra = eventOneDate + extradays,
eventTwoDate_w_extra = eventTwoDate + extradays,
daysToEventOne = as.integer(eventOneDate - startDate),
daysToEventTwo = as.integer(eventTwoDate - startDate)
)
id startDate eventOneDate eventTwoDate extradays startDate_w_extra eventOneDate_w_extra eventTwoDate_w_extra daysToEventOne daysToEventTwo
1 100 2015-04-02 2015-04-10 2015-05-02 12 2015-04-14 2015-04-22 2015-05-14 8 30
2 103 2017-01-10 2017-03-13 2017-08-12 9 2017-01-19 2017-03-22 2017-08-21 62 214
3 104 2016-02-11 2016-03-04 2016-05-09 16 2016-02-27 2016-03-20 2016-05-25 22 88
4 107 2012-01-16 2012-03-06 2012-04-05 5 2012-01-21 2012-03-11 2012-04-10 50 80
5 109 2014-03-05 2014-03-11 2014-06-09 10 2014-03-15 2014-03-21 2014-06-19 6 96
Upvotes: 3