Nelly

Reputation: 423

Error in model.frame.default: variable lengths differ, R predict function

I know this is not a new question. I have seen several proposed solutions elsewhere and tried them, but none of them works, so I am asking here.

How can I fix this error? I am using R version 3.5.3 (2019-03-11).

Error in model.frame.default(data = ov_val, formula = Surv(time = ov_dev$futime,  : variable lengths differ (found for 'rx')

Here is a reproducible example:

library(survival)
library(survminer)
library(dplyr)

# Create fake development dataset
ov_dev <- ovarian[1:13,]

# Create fake validation dataset
ov_val <- ovarian[13:26,]

# Run cox model
fit.coxph <- coxph(Surv(time = ov_dev$futime, event = ov_dev$fustat) ~ rx + resid.ds + age + ecog.ps, data = ov_dev)

summary(fit.coxph)

# Where error occurs
p <- log(predict(fit.coxph, newdata = ov_val, type = "expected"))

Upvotes: 0

Views: 2332

Answers (1)

Allan Cameron

Reputation: 173793

I think this has happened because you have used ov_dev$futime and ov_dev$fustat in your model specification rather than the bare names futime and fustat. When you call predict, the model frame is rebuilt from newdata, but the response is still taken directly from ov_dev (13 rows) while the covariates are taken from ov_val (14 rows), so the variable lengths differ. Just remove the data frame prefix and trust the data parameter:

library(survival)
library(survminer)
library(dplyr)

# Create fake development dataset
ov_dev <- ovarian[1:13,]

# Create fake validation dataset
ov_val <- ovarian[13:26,]

# Run cox model
fit.coxph <- coxph(Surv(futime, fustat) ~ rx + resid.ds + age + ecog.ps, 
                   data = ov_dev)

p <- log(predict(fit.coxph, newdata = ov_val, type = "expected"))

p
#>  [1]  0.4272783 -0.1486577 -1.8988833 -1.1887086 -0.8849632 -1.3374428
#>  [7] -1.2294725 -1.5021708 -0.3264792  0.5633839 -3.0457613 -2.2476071
#> [13] -1.6754877 -3.0691996

Created on 2020-08-19 by the reprex package (v0.3.0)
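
To see the mismatch in isolation, here is a minimal sketch that builds the model frame directly instead of going through coxph and predict. It assumes the same ov_dev / ov_val split as above and uses only rx on the right-hand side for brevity; the names bad and good are just illustrative.

```r
library(survival)

ov_dev <- ovarian[1:13, ]   # 13 rows
ov_val <- ovarian[13:26, ]  # 14 rows

# Prefixed response: always read from ov_dev, whatever `data` says,
# so it has 13 rows while rx (looked up in ov_val) has 14.
bad <- tryCatch(
  model.frame(Surv(ov_dev$futime, ov_dev$fustat) ~ rx, data = ov_val),
  error = function(e) conditionMessage(e)
)
bad  # the error message, captured as a string

# Bare names: every variable is looked up in `data`, so lengths agree.
good <- model.frame(Surv(futime, fustat) ~ rx, data = ov_val)
nrow(good) == nrow(ov_val)
```

The first call should fail with the same "variable lengths differ" complaint as in the question, while the second builds a frame with one row per row of ov_val.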

Upvotes: 3
