Reputation: 1502
Can a survival model with just the treatment as a predictor be fit with a tidymodels survival function?
Here I mention the example, which uses many predictors, then try to duplicated it with only one predictor. This fails.
https://www.tidyverse.org/blog/2021/11/survival-analysis-parsnip-adjacent/ has code to fit a survival tidymodel
library(survival)
bladder_train <- bladder[-c(1:3),]
bladder_test <- bladder[1:3,]
cox_spec <- proportional_hazards(penalty = 0.123) %>%
set_engine("glmnet")
f_fit <- fit(cox_spec,
Surv(stop, event) ~ rx + size + number + strata(enum),
data = bladder_train)
But with only the treatment in the model, it does not work
f_fit <- fit(cox_spec,
Surv(stop, event) ~ rx,
data = bladder_train)
Why? What am I missing
Upvotes: 0
Views: 202
Reputation: 1137
It seems the error has more to do with glmnet than tidymodels. This is the error:
library(survival)
library(censored)
#> Loading required package: parsnip
library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#> method from
#> required_pkgs.model_spec parsnip
bladder_train <- bladder[-c(1:3), ]
bladder_test <- bladder[1:3, ]
cox_spec <- proportional_hazards(penalty = 0.123) %>%
set_engine("glmnet")
f_fit <- fit(cox_spec,
Surv(stop, event) ~ rx,
data = bladder_train)
#> Error in glmnet::glmnet(data_obj$x, data_obj$y, family = "cox", alpha = alpha, : x should be a matrix with 2 or more columns
Created on 2021-12-30 by the reprex package (v2.0.1)
glmnet needs a mtrix with 2 or more columns. Using just rx
means you'd have just one column. If I add size
as an additional feature, it works just fine.
library(survival)
library(censored)
#> Loading required package: parsnip
library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#> method from
#> required_pkgs.model_spec parsnip
bladder_train <- bladder[-c(1:3), ]
bladder_test <- bladder[1:3, ]
cox_spec <- proportional_hazards(penalty = 0.123) %>%
set_engine("glmnet")
f_fit <- fit(cox_spec,
Surv(stop, event) ~ rx + size,
data = bladder_train)
f_fit
#> parsnip model object
#>
#> Fit time: NA
#>
#> Call: glmnet::glmnet(x = data_obj$x, y = data_obj$y, family = "cox", alpha = alpha, lambda = lambda)
#>
#> Df %Dev Lambda
#> 1 0 0.00 0.070850
#> 2 1 0.10 0.064560
#> 3 1 0.18 0.058820
#> 4 1 0.24 0.053600
#> 5 1 0.30 0.048840
#> 6 2 0.35 0.044500
#> 7 2 0.43 0.040550
#> 8 2 0.50 0.036940
#> 9 2 0.55 0.033660
#> 10 2 0.60 0.030670
#> 11 2 0.64 0.027950
#> 12 2 0.67 0.025460
#> 13 2 0.70 0.023200
#> 14 2 0.72 0.021140
#> 15 2 0.74 0.019260
#> 16 2 0.75 0.017550
#> 17 2 0.77 0.015990
#> 18 2 0.78 0.014570
#> 19 2 0.79 0.013280
#> 20 2 0.79 0.012100
#> 21 2 0.80 0.011020
#> 22 2 0.81 0.010040
#> 23 2 0.81 0.009151
#> 24 2 0.81 0.008338
#> 25 2 0.82 0.007597
#> 26 2 0.82 0.006922
#> 27 2 0.82 0.006308
#> 28 2 0.82 0.005747
#> 29 2 0.82 0.005237
#> 30 2 0.83 0.004771
#> 31 2 0.83 0.004348
#> 32 2 0.83 0.003961
#> 33 2 0.83 0.003609
#> 34 2 0.83 0.003289
#> 35 2 0.83 0.002997
#> 36 2 0.83 0.002730
#> 37 2 0.83 0.002488
#> 38 2 0.83 0.002267
#> 39 2 0.83 0.002065
#> 40 2 0.83 0.001882
#> 41 2 0.83 0.001715
#> The training data has been saved for prediction.
Created on 2021-12-30 by the reprex package (v2.0.1)
If you wanted to just use one feature rx
, consider other models e.g. decision trees
library(survival)
library(censored)
#> Loading required package: parsnip
library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#> method from
#> required_pkgs.model_spec parsnip
bladder_train <- bladder[-c(1:3), ]
bladder_test <- bladder[1:3, ]
dt_spec <- decision_tree() %>%
set_engine("rpart") %>%
set_mode("censored regression")
f_fit <- fit(dt_spec,
Surv(stop, event) ~ rx,
data = bladder_train)
f_fit
#> parsnip model object
#>
#> $rpart
#> n= 337
#>
#> node), split, n, deviance, yval
#> * denotes terminal node
#>
#> 1) root 337 403.0968 1.0000000
#> 2) rx>=1.5 152 166.6335 0.7751669 *
#> 3) rx< 1.5 185 231.2927 1.1946030 *
#>
#> $survfit
#>
#> Call: prodlim::prodlim(formula = form, data = data)
#> Stratified Kaplan-Meier estimator for the conditional event time survival function
#> Discrete predictor variable: rpartFactor (0.775166899958249, 1.19460305107131)
#>
#> $levels
#> [1] "0.775166899958249" "1.19460305107131"
#>
#> attr(,"class")
#> [1] "pecRpart"
Created on 2021-12-30 by the reprex package (v2.0.1)
Upvotes: 2