Reputation: 91
My dataset has 3 variables:
Patient ID Outcome Duration
1 1 3
1 0 4
1 0 5
2 0 2
3 1 1
3 1 2
What I want is the first observation for "Duration" for each patient ID to be carried forward.
That is, for patient #1 I want duration to read 3,3,3 for patient #3 I want duration to read 1, 1.
Upvotes: 2
Views: 391
Reputation: 78832
This is a good job for dplyr
(a data.frame wicked-better successor to plyr
with far better syntax than data.table
):
library(dplyr)
dat %>%
group_by(`Patient ID`) %>%
mutate(Duration=first(Duration))
## Source: local data frame [6 x 3]
## Groups: Patient ID
##
## Patient ID Outcome Duration
## 1 1 1 3
## 2 1 0 3
## 3 1 0 3
## 4 2 0 2
## 5 3 1 1
## 6 3 1 1
Upvotes: 2
Reputation: 56935
Another alternative using plyr
(if you will be doing lots of operations on your dataframe though, and particularly if it's big, I recommend data.table
. It has a steeper learning curve but well worth it).
library(plyr)
ddply(mydf, .(PatientID), transform, Duration=Duration[1]) PatientID
# Outcome Duration
# 1 1 1 3
# 2 1 0 3
# 3 1 0 3
# 4 2 0 2
# 5 3 1 1
# 6 3 1 1
Upvotes: 0
Reputation: 23574
Here is one way with data.table. You take the first number in Duration
and ask R to repeat it for each PatientID
.
mydf <- read.table(text = "PatientID Outcome Duration
1 1 3
1 0 4
1 0 5
2 0 2
3 1 1
3 1 2", header = T)
library(data.table)
setDT(mydf)[, Duration := Duration[1L], by = PatientID]
print(mydf)
# PatientID Outcome Duration
#1: 1 1 3
#2: 1 0 3
#3: 1 0 3
#4: 2 0 2
#5: 3 1 1
#6: 3 1 1
Upvotes: 5