user21637271

Reputation: 31

Error message in Fine Gray regression analysis in R (package cmprsk)

I am running a competing-risks analysis in R using Fine & Gray regression. Here is my code, with death as the competing risk:

fg.model <- crr(ftime,fstatus,cov,failcode=1,cencode=0)

ftime is a numeric variable ranging from 1 to 180 days that records each patient's follow-up time until death (fstatus == 1). Patients still alive at the end of follow-up have ftime equal to 180 and fstatus equal to 0. For example, a patient who dies after 30 days of follow-up has ftime = 30 and fstatus = 1, while a patient who survives the full follow-up (maximum 180 days) has ftime = 180 and fstatus = 0. fstatus is also numeric.
The cov argument is a data frame with two covariates, age and sex (sex converted to a factor). failcode is 1 because death is the competing event, and cencode is 0 because survivors are treated as censored.

I have the following error message:

# NAs introduced by coercion Error in crr(ftime,fstatus,cov,failcode=1, : 
# NAs introduced by coercion NA/NaN/Inf in foreign function call (arg 4)

Since I have no missing data in my database, what can explain this error message and how can I solve it?

I have already tried na.omit, complete.cases, and other checks to make sure there is no missing data. I also checked the structure of the data: time and status are numeric, and sex in cov is a factor.

Here is a code reproducing my dataset and the error message:

# Set the sample size
n <- 8076
# Create a variable for follow-up time
time <- c(rep(180, 6533), sample(1:179, n - 6533, replace = TRUE))
# Create a variable for status
status <- ifelse(time == 180, 0, 1) # 0 = alive/censored
                                    # 1 = death
# Create age
age <- sample(18:90, n, replace = TRUE)
# Create gender
sex <- sample(c("male", "female"), n, replace = TRUE)
# Combine
df <- data.frame(time, status, age, sex)
# Create cov
cov <- subset(df, select = c("age", "sex"))
cov$sex <- as.factor(cov$sex)
# Run Fine Gray model
library(cmprsk)
fg.mod <- crr(df$time, df$status, cov, failcode = 1, cencode = 0)

Upvotes: 1

Views: 771

Answers (1)

SamR

Reputation: 20492

Pass your own design matrix to the model using model.matrix()

The solution to the question in your comment - what happens if you have a factor with several levels - is to do this:

# Create factor with three levels
cov$income <- factor(sample(c("high", "med", "low"), nrow(cov), replace = TRUE))

# Define factors for model spec
factors <- c("sex", "income")
model_spec <- reformulate(
    paste0("age+", paste(factors, collapse = "+"))
) # ~age + sex + income

covariates_matrix <- model.matrix(
    model_spec,
    data = cov,
    contrasts.arg = lapply(cov[factors], contrasts)
)[, -1] # first column is constant intercept (1)

head(covariates_matrix, 3)
#   age sexmale incomelow incomemed
# 1  60       1         0         1
# 2  39       0         1         0
# 3  23       0         0         0

crr(df$time, df$status, covariates_matrix, failcode = 1, cencode = 0)

# convergence:  TRUE
# coefficients:
#        age    sexmale  incomelow  incomemed
#  0.0002871  0.0506400 -0.0391500 -0.0098910
# standard errors:
# [1] 0.001193 0.050910 0.062340 0.062020
# two-sided p-values:
#       age   sexmale incomelow incomemed
#      0.81      0.32      0.53      0.87

As you can see, each factor variable now has a coefficient for each non-reference level, rather than being treated as continuous.

Why you need to do this

The difficulty you have is that the cmprsk::crr() function does not support model formulas. Instead it takes a matrix of covariates, and a matrix can hold only one type, so all covariates are coerced to the same class. In your case, the factor variable leads to everything being coerced to character, and converting those strings back to numbers produces the NAs introduced by coercion. As the cmprsk docs state:

The model.matrix function can be used to generate suitable matrices of covariates from factors
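The coercion described above can be reproduced with base R alone. This is a minimal sketch on a hypothetical three-row data frame (cov_demo is made up for illustration), and it assumes that crr() converts its covariate argument in roughly this way internally:

```r
# Toy data frame mirroring the question: one numeric column, one factor
cov_demo <- data.frame(
    age = c(60, 39, 23),
    sex = factor(c("male", "female", "female"))
)

# as.matrix() on mixed numeric/factor columns yields a character matrix
m <- as.matrix(cov_demo)
typeof(m)
# "character"

# Forcing that matrix to numeric turns the factor labels into NA
# (R would normally warn "NAs introduced by coercion" here)
m_num <- suppressWarnings(matrix(as.numeric(m), nrow = nrow(m)))
anyNA(m_num)
# TRUE
```

So the data itself has no missing values; the NAs only appear once the factor labels are forced into a numeric matrix.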

In your original question, which had a binary factor, we could just do cov$sex <- cov$sex=="male", and get a binary covariate which is easy to interpret. However, with more levels, we need to use model.matrix(). If you want to change how the reference level is represented, see this question.
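As a rough illustration of both points, here is a small base R sketch on a hypothetical four-element factor. The names sex_male, sex_ref_male and design are made up for this example, and relevel() is only one of several ways to change the reference level:

```r
sex <- factor(c("male", "female", "female", "male"))

# Binary shortcut: a 0/1 indicator for "male"
sex_male <- as.numeric(sex == "male")

# Make "male" the reference level before building the design matrix,
# so the remaining column is an indicator for "female"
sex_ref_male <- relevel(sex, ref = "male")
design <- model.matrix(~ sex_ref_male)[, -1, drop = FALSE]
colnames(design)
# "sex_ref_malefemale"
```

Using drop = FALSE keeps design as a one-column matrix, which matters when only a single covariate column remains after removing the intercept.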

Upvotes: 0
