Reputation: 25
just wondering if anyone can help me with this frustrating problem- im not sure how to go about it as it involves a combination of problems. I'm pretty new to coding so there may be some unclear points in this question- please let me know if something doesn't make sense!
an example of my data
#reproducible data (hopefully)
dat <-structure(list(id = 1:5,
opstart = structure(c(1514970000, 1514904000, 1514916900, 1514883900, 1514979600),
class = c("POSIXct", "POSIXt"), tzone = "UTC"),
crdate_1 = structure(c(1514818380, 1514965080,1514752680, 1514760180, 1514919480),
class = c("POSIXct", "POSIXt"), tzone = "UTC"),
crdate_2 = structure(c(1515058680, 1515058740,1514817780, 1514965080, 1515064560),
class = c("POSIXct", "POSIXt"), tzone = "UTC"),
crdate_3 = structure(c(1515564000, 1515137700, 1514876100, NA, 1516101000),
class = c("POSIXct", "POSIXt"), tzone = "UTC"),
aki_1 = c("aki", NA, NA, NA, "aki"),
aki_2 = c("aki", NA, "aki", NA, "aki"),
aki_3 = c("aki", "aki", "aki", NA, "aki")),
row.names = c(NA, 5L), class = "data.frame")
id opstart crdate_1 crdate_2 crdate_3 aki_1 aki_2 aki_3
1 1 2018-01-03 09:00:00 2018-01-01 14:53:00 2018-01-04 09:38:00 2018-01-10 06:00:00 aki aki aki
2 2 2018-01-02 14:40:00 2018-01-03 07:38:00 2018-01-04 09:39:00 2018-01-05 07:35:00 <NA> <NA> aki
3 3 2018-01-02 18:15:00 2017-12-31 20:38:00 2018-01-01 14:43:00 2018-01-02 06:55:00 <NA> aki aki
4 4 2018-01-02 09:05:00 2017-12-31 22:43:00 2018-01-03 07:38:00 <NA> <NA> <NA> <NA>
5 5 2018-01-03 11:40:00 2018-01-02 18:58:00 2018-01-04 11:16:00 2018-01-16 11:10:00 aki aki aki
what I am trying to do is to make a new column called aki_status using mutate which will detail whether "aki" was before the opdate or after the opdate. crdate_1 corresponds to aki_1, and crdate_2 corresponds to aki_2 etc.
the complication is that the aki_status should be based on the column which "aki" first appears. so for example, for the first row, aki comes up in aki_1, so crdate_1 would be used in the comparison against opdate, but in row three, aki comes up in aki_2 first, so crdate_2 should be used in the comparison against opdate.
the ideal output would be
aki_status(preop, postop, preop, NA, postop)
Upvotes: 0
Views: 44
Reputation: 67020
Here's an approach that relies on first reshaping the original data into a longer format that is easier to work with. The table "spec" here defines which column ("date, "aki", or "obs") we want the columns in the source data to feed into.
library(tidyverse)
spec <- tribble(
~.name, ~.value, ~obs,
"crdate_1", "date", 1,
"crdate_2", "date", 2,
"crdate_3", "date", 3,
"aki_1", "aki", 1,
"aki_2", "aki", 2,
"aki_3", "aki", 3
)
dat_long <- pivot_longer_spec(dat, spec)
This table is easier to work with since it allows us to directly compare each crdate with its respective opstart date, while keeping track of which round it is.
#> dat_long
## A tibble: 15 x 5
# id opstart obs date aki
# <int> <dttm> <dbl> <dttm> <chr>
# 1 1 2018-01-03 09:00:00 1 2018-01-01 14:53:00 aki
# 2 1 2018-01-03 09:00:00 2 2018-01-04 09:38:00 aki
# 3 1 2018-01-03 09:00:00 3 2018-01-10 06:00:00 aki
# 4 2 2018-01-02 14:40:00 1 2018-01-03 07:38:00 NA
# 5 2 2018-01-02 14:40:00 2 2018-01-04 09:39:00 NA
# 6 2 2018-01-02 14:40:00 3 2018-01-05 07:35:00 aki
# 7 3 2018-01-02 18:15:00 1 2017-12-31 20:38:00 NA
# 8 3 2018-01-02 18:15:00 2 2018-01-01 14:43:00 aki
# 9 3 2018-01-02 18:15:00 3 2018-01-02 06:55:00 aki
#10 4 2018-01-02 09:05:00 1 2017-12-31 22:43:00 NA
#11 4 2018-01-02 09:05:00 2 2018-01-03 07:38:00 NA
#12 4 2018-01-02 09:05:00 3 NA NA
#13 5 2018-01-03 11:40:00 1 2018-01-02 18:58:00 aki
#14 5 2018-01-03 11:40:00 2 2018-01-04 11:16:00 aki
#15 5 2018-01-03 11:40:00 3 2018-01-16 11:10:00 aki
Now we can look at the first "aki" within each "id", to get a summary stat "aki_status" for each "id".
results <- dat_long %>%
group_by(id) %>%
filter(aki == "aki") %>%
slice_min(date) %>% # or slice_min(obs) -- same result always?
mutate(aki_status = if_else(date < opstart, "preop", "postop"))
dat %>% left_join(results, by = c("id", "opstart"))
id opstart crdate_1 crdate_2 crdate_3 aki_1 aki_2 aki_3 obs date aki aki_status
1 1 2018-01-03 09:00:00 2018-01-01 14:53:00 2018-01-04 09:38:00 2018-01-10 06:00:00 aki aki aki 1 2018-01-01 14:53:00 aki preop
2 2 2018-01-02 14:40:00 2018-01-03 07:38:00 2018-01-04 09:39:00 2018-01-05 07:35:00 <NA> <NA> aki 3 2018-01-05 07:35:00 aki postop
3 3 2018-01-02 18:15:00 2017-12-31 20:38:00 2018-01-01 14:43:00 2018-01-02 06:55:00 <NA> aki aki 2 2018-01-01 14:43:00 aki preop
4 4 2018-01-02 09:05:00 2017-12-31 22:43:00 2018-01-03 07:38:00 <NA> <NA> <NA> <NA> NA <NA> <NA> <NA>
5 5 2018-01-03 11:40:00 2018-01-02 18:58:00 2018-01-04 11:16:00 2018-01-16 11:10:00 aki aki aki 1 2018-01-02 18:58:00 aki preop
Upvotes: 1