reza baneshi


Selecting a matched control with constraints

I have a data set with a binary outcome (mental) and four independent variables: stroke and gender are binary, and age and bmi are continuous. The variables mental and stroke indicate whether each subject was diagnosed with these conditions. I also have two columns with the dates of diagnosis or censoring for the mental and stroke conditions (date_mental and date_stroke). For all subjects, date_stroke > date_mental. I want to estimate the risk of stroke after mental problems, following these steps:

a) For each case (i.e., mental=1), I want to select a matched control (i.e., mental=0). The matching is based on gender, age, and bmi.

b) Then I want to create a new variable called date_mental_adj. For each case (i.e., mental=1), date_mental_adj is the same as date_mental. For each control (i.e., mental=0), date_mental_adj should be the date_mental of its matched case. This is because I want to follow both members of a pair from the same time point.

c) I want to calculate the follow-up time from date_mental_adj to date_stroke and fit a Cox model with the follow-up time and stroke as the outcome and mental as the predictor.
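To make step c concrete, here is a minimal sketch of the Cox model I have in mind. It assumes the `survival` package and uses a small simulated data frame (`toy`, a hypothetical name) in place of the real matched data, so the numbers it produces mean nothing.

```r
library(survival)

set.seed(1)
n <- 200

# Hypothetical stand-in for the matched data: positive follow-up times in days,
# a stroke event indicator, and the mental exposure indicator.
toy <- data.frame(
  mental         = rep(c(0, 1), each = n / 2),
  follow_up_time = rexp(n, rate = 1 / 700),
  stroke         = rbinom(n, 1, 0.5)
)

# Cox proportional hazards model: time to stroke, with mental as the only predictor.
fit <- coxph(Surv(follow_up_time, stroke) ~ mental, data = toy)
summary(fit)
```

This sketch ignores the matched-pair structure; whether to account for it (for example by stratifying on the pair) is a separate modelling choice.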

I did steps a and b using the MatchIt package, but some of the follow-up times are negative. This means that, for some controls, the date of stroke was before the date_mental of their matched cases. I want to apply a constraint so that date_stroke for each control is after its date_mental_adj.

I simulated the data and used MatchIt to select matched controls. The code is provided below, but I cannot work out how to apply the constraint. Any thoughts are appreciated.

# seed number
set.seed(10)

# Number of samples
n <- 1000

# Generate outcome
mental <- sample(c(0, 1), n, replace = TRUE)

# Generate independent variables
stroke <- sample(c(0, 1), n, replace = TRUE)
gender <- sample(c(0, 1), n, replace = TRUE)
age <- rnorm(n, mean=50, sd=3)
bmi <- rnorm(n, mean=24, sd=2)

# Generate date variables (stroke always after mental)
date_mental <- as.Date(sample(seq(as.Date("1996-01-01"), as.Date("2019-12-31"), by = "day"), n, replace = FALSE))
date_stroke <- date_mental + 700


# Combine all variables into a data frame
my_data <- data.frame(mental, stroke, gender, age, bmi, date_mental, date_stroke)


# Check the first few rows of the generated data
head(my_data)

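# start matching
library(MatchIt)
library(dplyr)  # needed below for %>%, group_by(), mutate(), first()

# 1:1 nearest-neighbour matching on a logistic-regression propensity score
m.out1 <- matchit(mental ~ gender + age + bmi, method = "nearest", distance = "glm", ratio = 1, data = my_data)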

# match data
match_data <- match.data(m.out1)

# create date_mental_adj: every member of a pair gets the date_mental of its case
# (ifelse() drops the Date class, so take the case's date directly instead)
match_data <- match_data %>%
  group_by(subclass) %>%
  mutate(date_mental_adj = first(date_mental[mental == 1])) %>%
  ungroup()


# create the follow-up time
# base R equivalent of lubridate::time_length(), so no extra package is needed
match_data$follow_up_time <- as.numeric(difftime(match_data$date_stroke, match_data$date_mental_adj, units = "days"))
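
To show what the constraint means in practice, here is a minimal base R sketch on a tiny hand-made data set. The names `pairs`, `cases`, `bad`, and `valid` are hypothetical. It only removes invalid pairs after matching; it does not enforce the constraint during matching, which is what I am actually looking for.

```r
# Toy matched data: one case (mental = 1) and one control (mental = 0) per subclass.
pairs <- data.frame(
  subclass    = c(1, 1, 2, 2, 3, 3),
  mental      = c(1, 0, 1, 0, 1, 0),
  date_mental = as.Date(c("2005-03-01", "2001-06-15",
                          "2010-01-10", "2012-02-20",
                          "2015-07-04", "1998-09-30")),
  date_stroke = as.Date(c("2007-01-30", "2003-05-16",
                          "2011-12-11", "2014-01-20",
                          "2017-06-03", "2000-08-30"))
)

# Index date of each pair: the date_mental of its case.
cases <- pairs[pairs$mental == 1, ]
pairs$date_mental_adj <- cases$date_mental[match(pairs$subclass, cases$subclass)]

# Follow-up time in days from the shared index date to date_stroke.
pairs$follow_up_time <- as.numeric(
  difftime(pairs$date_stroke, pairs$date_mental_adj, units = "days")
)

# A pair violates the constraint if any member has non-positive follow-up.
bad   <- unique(pairs$subclass[pairs$follow_up_time <= 0])
valid <- pairs[!pairs$subclass %in% bad, ]
valid
```

Here only subclass 2 survives, because the controls in subclasses 1 and 3 have a date_stroke before the date_mental of their matched case. Dropping pairs this way leaves those cases unmatched, which is why I would prefer a constraint applied at the matching step.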
