Reputation: 1285
Here is a representation of my dataset
set.seed(1)
library (tidyverse)
Date<-c(1:6,1:8,1:10)
ID<-c(rep(1,3*2),rep(2,4*2),rep(3,5*2))
Surgery<-c(c("Surg1",NA,NA,NA,"Surg2",NA),
c(NA,NA,NA,"Surg.a",NA,NA,"Surg.f",NA),
c("Surg.C",NA,NA,"Surg.A",NA,NA,"Surg.X",NA,NA,NA))
Complication<-sample(c(rep("Infection",8),rep("Pain",7),rep("bleeding",5),rep("Oedema",4)))
NumberOfSurgery<-c(rep(2,6),rep(2,8),rep(3,10))
OrderOfSurgery<-c(1,rep(NA,3),2,rep(NA,4),1,NA,NA,2,NA,1,NA,NA,2,NA,NA,3,rep(NA,3))
mydata<-data.frame(ID,Date,Surgery,Complication,NumberOfSurgery,OrderOfSurgery)
mydata
There are three patients. The first one has benefited from two surgeries over time, the second one 2 also and the last one three surgeries.
I would like to select all complications from the first day of surgery 1 to the first day of surgery 2, and for all individuals; in order to have such a dataset below
ID Date Surgery Complication NumberOfSurgery OrderOfSurgery
1 1 1 Surg1 Infection 2 1
2 1 2 <NA> Infection 2 NA
3 1 3 <NA> Infection 2 NA
4 1 4 <NA> Infection 2 NA
5 1 5 Surg2 Pain 2 2
10 2 4 Surg.a bleeding 2 1
11 2 5 <NA> Pain 2 NA
12 2 6 <NA> Infection 2 NA
13 2 7 Surg.f bleeding 2 2
15 3 1 Surg.C Pain 3 1
16 3 2 <NA> Pain 3 NA
17 3 3 <NA> Pain 3 NA
18 3 4 Surg.A bleeding 3 2
Here is how I proceeded: The dates are already arranged
Firstly, I suppressed for the second individuals all the observation that occurred before the first surgery (that is from the date 1 to the date 3).
mydata2<-mydata%>%mutate(Surgery2=Surgery,OrderOfSurgery2=OrderOfSurgery)%>%
group_by(ID)%>%fill(Surgery2,OrderOfSurgery2)%>%filter(!is.na(Surgery2))
Then I could select only all observations that follow the first surgery, by doing this:
mydata3<-mydata2%>%filter(OrderOfSurgery2==1)
What I want to do is to include the row at the date of the second surgery,as I mentionned above.
Upvotes: 0
Views: 143
Reputation: 1474
Using lag()
and shift OrderOfSurgery2
down by one position, filling the blank with the default value of 1 should do.
mydata3 <- mydata2 %>%
mutate(OrderOfSurgery2 = lag(OrderOfSurgery2, default = 1)) %>%
filter(OrderOfSurgery2 == 1)
# # A tibble: 13 x 8
# # Groups: ID [3]
# ID Date Surgery Complication NumberOfSurgery OrderOfSurgery Surgery2
# <dbl> <int> <chr> <chr> <dbl> <dbl> <chr>
# 1 1 1 Surg1 Infection 2 1 Surg1
# 2 1 2 NA Infection 2 NA Surg1
# 3 1 3 NA Infection 2 NA Surg1
# 4 1 4 NA Infection 2 NA Surg1
# 5 1 5 Surg2 Pain 2 2 Surg2
# 6 2 4 Surg.a bleeding 2 1 Surg.a
# 7 2 5 NA Pain 2 NA Surg.a
# 8 2 6 NA Infection 2 NA Surg.a
# 9 2 7 Surg.f bleeding 2 2 Surg.f
# <Omitted>
Upvotes: 1