vokey588

Reputation: 203

R code to detect a change in a variable over time for multiple patients

I have a data set with multiple rows per patient, where each row represents a 1-week period over the course of 4 months. There is a variable grade that can take the values 1, 2, or 3, and I want to detect whether a patient's grade ever INCREASES (1 to 2, 1 to 3, or 2 to 3) at any point. The result would be a yes/no variable per patient. I could write a function to do it, but I'm betting there is some clever functional programming that makes use of existing R functions. A sample data set is below. Thank you!

df <- data.frame(patient = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
                 period  = c(1, 2, 3, 1, 3, 1, 3, 4, 5),
                 grade   = c(1, 1, 1, 2, 3, 1, 1, 2, 3))

What I want is a resulting data frame like this:

data.frame(patient = c(1, 2, 3), grade.increase = c(0, 1, 1))

Upvotes: 1

Views: 1466

Answers (2)

ulfelder

Reputation: 5335

If you feel like doing this in base R, here's a solution that uses the split-apply-combine approach.

  • You use split to make a list with a separate data frame for each patient;
  • you use lapply to iterate a summarization function over each list element, where the summarization function uses diff to look at changes in grade between consecutive rows (so each patient's rows need to be sorted by period) and if and any to summarize; and then
  • you wrap the whole thing in do.call(rbind, ...) to collapse the resulting list into a data frame.

Here's what that looks like:

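do.call(rbind, lapply(split(df, df[, "patient"]), function(i) {
    # i holds the rows for a single patient
    data.frame(patient = i[, "patient"][1],
               # 1 if any change between consecutive rows is positive, else 0
               grade.increase = if (any(diff(i[, "grade"]) > 0)) 1 else 0)
}))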

Result:

  patient grade.increase
1       1              0
2       2              1
3       3              1
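
To see why diff does the job, it can help to look at the intermediate values. The following is only a minimal sketch of the same idea, not a replacement for the code above. It sorts by period first so that diff compares consecutive periods, and the names grades_by_patient, increase, and result are my own and do not come from the question.

```r
df <- data.frame(patient = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
                 period  = c(1, 2, 3, 1, 3, 1, 3, 4, 5),
                 grade   = c(1, 1, 1, 2, 3, 1, 1, 2, 3))

# Sort so that diff() compares consecutive periods within a patient.
df <- df[order(df$patient, df$period), ]

# One grade vector per patient (list names are "1", "2", "3").
grades_by_patient <- split(df$grade, df$patient)

# diff() returns the change between consecutive rows:
# patient 1: 0 0, patient 2: 1, patient 3: 0 1 1
lapply(grades_by_patient, diff)

# Collapse each patient to a single 0/1 flag.
increase <- vapply(grades_by_patient,
                   function(g) as.integer(any(diff(g) > 0)),
                   integer(1))

result <- data.frame(patient = as.numeric(names(increase)),
                     grade.increase = unname(increase))
result
```

A patient with only one row gets 0 here, because diff returns an empty vector and any() of an empty vector is FALSE.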

Upvotes: 0

Fnguyen

Reputation: 1177

library(dplyr)

df %>%
  arrange(patient, period) %>%
  # group first, so lag() never compares across two different patients
  group_by(patient) %>%
  mutate(grade.increase = case_when(grade > lag(grade) ~ TRUE, TRUE ~ FALSE)) %>%
  summarise(grade.increase = max(grade.increase))

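lag returns the previous row's value, so comparing grade with lag(grade) inside case_when flags every row where the grade went up. A row with no previous value gives NA, which falls through to the TRUE ~ FALSE default.

Taking the maximum of grade.increase for each patient then gives the desired result, because arithmetic on logical values treats FALSE as 0 and TRUE as 1.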
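
One thing to watch with lag: it has to run within each patient, otherwise the first row of one patient is compared with the last row of the previous patient. Here is a small sketch of the difference. The data frame df2 and the names ungrouped and grouped are made up for illustration; neither patient's grade changes, so both flags should be 0.

```r
library(dplyr)

# Hypothetical data: patient 1 stays at grade 1, patient 2 stays at grade 2.
df2 <- data.frame(patient = c(1, 1, 2, 2),
                  period  = c(1, 2, 1, 2),
                  grade   = c(1, 1, 2, 2))

# lag() over the whole column: patient 2's first row is compared with
# patient 1's last row (2 > 1), so patient 2 is flagged by mistake.
ungrouped <- df2 %>%
  arrange(patient, period) %>%
  mutate(grade.increase = case_when(grade > lag(grade) ~ TRUE, TRUE ~ FALSE)) %>%
  group_by(patient) %>%
  summarise(grade.increase = max(grade.increase))

# lag() inside each patient: the first row of every patient has no
# previous value, so no comparison crosses a patient boundary.
grouped <- df2 %>%
  arrange(patient, period) %>%
  group_by(patient) %>%
  mutate(grade.increase = case_when(grade > lag(grade) ~ TRUE, TRUE ~ FALSE)) %>%
  summarise(grade.increase = max(grade.increase))

ungrouped$grade.increase  # 0 1
grouped$grade.increase    # 0 0
```

On the sample data from the question both versions happen to agree, since patient 2 really does increase and patient 3 starts lower than patient 2 ends.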

Upvotes: 4
