Katy Torres

Reputation: 137

Longitudinal data: Trying to establish if the subjects have a followup visit

I am trying to analyze longitudinal data. Each subject has come in for our study at least once and up to 3 times. I need to do comparisons of scores across visits to see if their treatments helped diminish the symptoms.

For now, I want to set up columns that indicate if the subject has a follow-up visit.

One column should indicate whether the subject came for a 2nd visit, and another whether the subject came back for a 3rd visit.

What my dataset looks like

visit_id  subject_id   visit_number   Measure1    Measure2 ...
1         Subject1         1
2         Subject2         1
3         Subject1         2
4         Subject3         1
5         Subject1         3

What I tried coding

I tried using sapply to loop through all the visits by subject ID and populate the columns according to whether that subject has a 2nd visit and a 3rd visit.

I also tried a for loop, but in each case I'm not sure how to tell it to loop through all instances of that subject and then select items to compare (i.e. the existence of a specific visit number).

sapply(dat$subject_id, function(x) {

if(dat$visit_number == 2) {followup2 <- "yes"
}else {followup2 <- "no"}

if(dat$visit_number == 3) {followup3 <- "yes"
}else {followup3 <- "no"}
})

What I want my dataset to look like

visit_id  subject_id   visit_number     followup2  followup3
1         Subject1         1            yes         yes
3         Subject1         2            yes         yes
5         Subject1         3            yes         yes
2         Subject2         1            yes         no
6         Subject2         2            yes         no
4         Subject3         1            no          no

I intend to use a similar logic to go through each subject and compare their symptoms across visits. Comparing visit 1 and 2 and then comparing visit 2 and 3.

Data

dat <- read.table(header = TRUE, stringsAsFactors = FALSE,
text = "visit_id  subject_id   visit_number
1         Subject1         1
3         Subject1         2
5         Subject1         3
2         Subject2         1
6         Subject2         2
4         Subject3         1")

Upvotes: 0

Views: 307

Answers (2)

rawr

Reputation: 20811

Since you are repeating the same task over and over, you can make a function to do the work and then loop over the moving parts.

dat <- read.table(header = TRUE, stringsAsFactors = FALSE,
text = "visit_id  subject_id   visit_number
1         Subject1         1
3         Subject1         2
5         Subject1         3
2         Subject2         1
6         Subject2         2
4         Subject3         1")

This function splits visit by each unique id and checks whether the maximum visit is at least num:

f <- function(id, visit, num) {
  ave(visit, id, FUN = function(x) if (max(x) >= num) 'yes' else 'no')
}
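To make the role of ave clearer, here is a minimal sketch that does the same thing in two explicit steps. The vectors id and visit and the names max_visit and followup2 are illustrative only; the values are copied from the example data. Note that this approach assumes visits are never skipped: a subject with visits 1 and 3 but no visit 2 would still get "yes" for the second follow-up.

```r
# Toy vectors with the same values as the example data (names are illustrative).
id    <- c("Subject1", "Subject1", "Subject1", "Subject2", "Subject2", "Subject3")
visit <- c(1L, 2L, 3L, 1L, 2L, 1L)

# Step 1: ave() applies max() within each subject and repeats the result
# on every row belonging to that subject.
max_visit <- ave(visit, id, FUN = max)
max_visit
# [1] 3 3 3 2 2 1

# Step 2: compare the per-subject maximum with the visit of interest.
followup2 <- ifelse(max_visit >= 2, "yes", "no")
followup2
# [1] "yes" "yes" "yes" "yes" "yes" "no"
```

The function f above folds both steps into the FUN argument, which is why it returns one value per row rather than one per subject.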

Make some test cases to make sure it is working

with(dat, f(subject_id, visit_number, 1))
# [1] "yes" "yes" "yes" "yes" "yes" "yes"
with(dat, f(subject_id, visit_number, 2))
# [1] "yes" "yes" "yes" "yes" "yes" "no" 
with(dat, f(subject_id, visit_number, 3))
# [1] "yes" "yes" "yes" "no"  "no"  "no" 

Then decide what you need to loop over. You can also assign new columns in your data frame for each loop iteration in one go:

idx <- 2:3

dat[, paste0('followup', idx)] <- lapply(idx, function(x)
  f(dat$subject_id, dat$visit_number, x))

#   visit_id subject_id visit_number followup2 followup3
# 1        1   Subject1            1       yes       yes
# 2        3   Subject1            2       yes       yes
# 3        5   Subject1            3       yes       yes
# 4        2   Subject2            1       yes        no
# 5        6   Subject2            2       yes        no
# 6        4   Subject3            1        no        no

Upvotes: 1

Kreuni

Reputation: 312

Rather than trying to do this all in one go, I'd separate it to first identifying if a subject had a second (or third) visit or not, and then adding a column using that data.

To do the first:

subj_2_visit <- dat$subject_id[dat$visit_number == 2]

Now subj_2_visit is a vector of all subjects who have had a second visit. You can then use ifelse() to create the new column:

dat$followup2 <- ifelse(dat$subject_id %in% subj_2_visit, "yes", "no")

The same can be used for three visits by changing the check in the first part.
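As a rough sketch of that generalization, the two steps can be wrapped in a small helper and called once per visit number. The helper name has_visit is hypothetical (it is not part of the answer above), and it uses lowercase "yes"/"no" to match the desired output in the question.

```r
# Example data from the question.
dat <- read.table(header = TRUE, stringsAsFactors = FALSE,
text = "visit_id  subject_id   visit_number
1         Subject1         1
3         Subject1         2
5         Subject1         3
2         Subject2         1
6         Subject2         2
4         Subject3         1")

# Hypothetical helper: "yes" on every row of a subject that has a row
# with visit_number equal to num, "no" otherwise.
has_visit <- function(data, num) {
  subj <- data$subject_id[data$visit_number == num]
  ifelse(data$subject_id %in% subj, "yes", "no")
}

dat$followup2 <- has_visit(dat, 2)
dat$followup3 <- has_visit(dat, 3)

dat$followup2
# [1] "yes" "yes" "yes" "yes" "yes" "no"
dat$followup3
# [1] "yes" "yes" "yes" "no"  "no"  "no"
```

Because this checks for the exact visit number rather than the maximum, a subject who skipped visit 2 but attended visit 3 would get "no" for followup2 and "yes" for followup3.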

Upvotes: 1
