Reputation: 7
I am trying to fit many survival models using tidymodels, workflow, and purr. I can get this approach to work for other models, e.g., linear regression, but not survival. I have loaded the survival extension to parsnip.
Here is code to
This message does not help me much. I suppose it has something to do with the in-line Surv() object but haven't been able to see how to construct this another way.
I realize that parsnip for survival may still be work-in-progress, but it is unclear to me if this has been implemented and should work or not. Appreciate any help, Thanks.
library(tidyverse)
library(tidymodels)
library(survival)
library(censored)
set.seed(1973)
df <- tibble(id = seq(1:1000))
df <- df %>% mutate( survtime = floor(100*rgamma(n=1000, shape =1 , rate=1)) ,
fail = runif(n=1000) >0.33 ,
a1 = runif(n=1000) >0.1,
a2 = runif(n=1000) >0.5)
head(df)
cox1 <- coxph(data = df, Surv(survtime, event=fail) ~a1)
summary(cox1)
cox2 <- coxph(data = df, Surv(survtime, event=fail) ~a2)
summary(cox2)
A <- c("a1", "a2")
models1 <- map(A,
~workflow() %>%
add_model(linear_reg()) %>%
add_formula(reformulate(.x, response = 'survtime')) %>%
fit(df)
)
models1
#not working
models2 <- map(A,
~workflow() %>%
add_model(proportional_hazards()) %>%
add_formula(reformulate(.x, response = 'Surv(survtime, event=fail)')) %>%
fit(df)
)
#not working
models3 <- map(A,
~workflow() %>%
add_model(proportional_hazards()) %>%
add_formula(as.formula(paste0('Surv(survtime, event=fail) ~ ', .x))) %>%
fit(df)
)
models3
EDITED to change variables X, x1, and x2 to A, a1, and a2 for clarity. Error message is the same:
Error in `fit_xy()': !Models for censored regression must use the formula interface.
However, problem may be version of R or packages. Error happens with version 4.1.2 with survival 3.4-0 parsnip 1.0.1
but code works on another system with R 4.2.2 parsnip 1.0.3 survival 3.3-1
Unfortunately, I am at sysadmin's mercy for updates so I did not check versioning and cannot easily troubleshoot.
Upvotes: 0
Views: 210
Reputation: 14316
Uppercase "X" instead of lower case "x" is the issue.
Your not the first to do this (I stared at it for a few minutes to see it) so don't feel bad.
However... this example is exactly why we have the reprex
package. That would have told you the issue right away since it starts with a fresh session (and would show us the error message)
Knowing's half the battle :-)
library(tidyverse)
library(tidymodels)
library(survival)
library(censored)
set.seed(1973)
df <- tibble(id = seq(1:1000))
df <- df %>% mutate( survtime = floor(100*rgamma(n=1000, shape =1 , rate=1)) ,
fail = runif(n=1000) >0.33 ,
x1 = runif(n=1000) >0.1,
x2 = runif(n=1000) >0.5)
head(df)
#> # A tibble: 6 × 5
#> id survtime fail x1 x2
#> <int> <dbl> <lgl> <lgl> <lgl>
#> 1 1 27 FALSE TRUE FALSE
#> 2 2 77 FALSE TRUE TRUE
#> 3 3 180 FALSE TRUE FALSE
#> 4 4 127 FALSE TRUE TRUE
#> 5 5 69 TRUE TRUE TRUE
#> 6 6 20 FALSE TRUE TRUE
cox1 <- coxph(data = df, Surv(survtime, event=fail) ~x1)
summary(cox1)
#> Call:
#> coxph(formula = Surv(survtime, event = fail) ~ x1, data = df)
#>
#> n= 1000, number of events= 692
#>
#> coef exp(coef) se(coef) z Pr(>|z|)
#> x1TRUE 0.2854 1.3303 0.1250 2.284 0.0224 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> exp(coef) exp(-coef) lower .95 upper .95
#> x1TRUE 1.33 0.7517 1.041 1.699
#>
#> Concordance= 0.509 (se = 0.007 )
#> Likelihood ratio test= 5.62 on 1 df, p=0.02
#> Wald test = 5.22 on 1 df, p=0.02
#> Score (logrank) test = 5.25 on 1 df, p=0.02
cox2 <- coxph(data = df, Surv(survtime, event=fail) ~x2)
summary(cox2)
#> Call:
#> coxph(formula = Surv(survtime, event = fail) ~ x2, data = df)
#>
#> n= 1000, number of events= 692
#>
#> coef exp(coef) se(coef) z Pr(>|z|)
#> x2TRUE 0.13210 1.14122 0.07655 1.726 0.0844 .
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> exp(coef) exp(-coef) lower .95 upper .95
#> x2TRUE 1.141 0.8763 0.9822 1.326
#>
#> Concordance= 0.513 (se = 0.011 )
#> Likelihood ratio test= 2.98 on 1 df, p=0.08
#> Wald test = 2.98 on 1 df, p=0.08
#> Score (logrank) test = 2.98 on 1 df, p=0.08
X <- c("x1", "x2")
models1 <- map(X,
~workflow() %>%
add_model(linear_reg()) %>%
add_formula(reformulate(.x, response = 'survtime')) %>%
fit(df)
)
models1
#> [[1]]
#> ══ Workflow [trained] ══════════════════════════════════════════════════════════
#> Preprocessor: Formula
#> Model: linear_reg()
#>
#> ── Preprocessor ────────────────────────────────────────────────────────────────
#> survtime ~ x1
#>
#> ── Model ───────────────────────────────────────────────────────────────────────
#>
#> Call:
#> stats::lm(formula = ..y ~ ., data = data)
#>
#> Coefficients:
#> (Intercept) x1TRUE
#> 121.24 -23.58
#>
#>
#> [[2]]
#> ══ Workflow [trained] ══════════════════════════════════════════════════════════
#> Preprocessor: Formula
#> Model: linear_reg()
#>
#> ── Preprocessor ────────────────────────────────────────────────────────────────
#> survtime ~ x2
#>
#> ── Model ───────────────────────────────────────────────────────────────────────
#>
#> Call:
#> stats::lm(formula = ..y ~ ., data = data)
#>
#> Coefficients:
#> (Intercept) x2TRUE
#> 104.055 -7.706
#not working
models2 <- map(x,
~workflow() %>%
add_model(proportional_hazards()) %>%
add_formula(reformulate(.x, response = 'Surv(survtime, event=fail)')) %>%
fit(df)
)
#> Error in vctrs_vec_compat(.x, .purrr_user_env): object 'x' not found
#not working
models3 <- map(x,
~workflow() %>%
add_model(proportional_hazards()) %>%
add_formula(as.formula(paste0('Surv(survtime, event=fail) ~ ', .x))) %>%
fit(df)
)
#> Error in vctrs_vec_compat(.x, .purrr_user_env): object 'x' not found
models3
#> Error in eval(expr, envir, enclos): object 'models3' not found
Created on 2023-01-06 with reprex v2.0.2
Upvotes: 2