Reputation: 1083
DF_test <- structure(list(`2013` = c(1L, 0L, 1L), `2014` = c(0L, 0L, 2L),
`2015` = c(0L, 0L, 1L), `2016` = c(0L, 0L, 0L), Sum = c(4,
0, 5)), .Names = c("2013", "2014", "2015", "2016", "Sum"), row.names = c(NA, 3L), class = "data.frame")
I'm trying to do a forward stepwise regression as such:
step(lm(Sum~1, data=DF_test), direction="forward", scope=~ 2013 + 2014 + 2015 + 2016)
Unfortunately executing it generates the following error:
Error in terms.formula(tmp, simplify = TRUE) :
invalid model formula in ExtractVars
Can anybody explain to me what this error is and how I can fix this?
Upvotes: 3
Views: 211
Reputation: 16871
Think about what you're using as your scope argument: in 2013 + 2014 + 2015 + 2016, the terms are parsed as numeric literals being added, not as names of columns, so R finds no variables in the formula. That's why it's generally good practice not to have names begin with numbers. You can get around it in one of two ways: either wrap those names in backticks, or rename the columns so they begin with a letter. Since these are years, it makes sense for them to start with "y".
# with backticks
step(lm(Sum~1, data=DF_test), direction="forward", scope=~ `2013` + `2014` + `2015` + `2016`)
# with better names
names(DF_test)[1:4] <- paste0("y", names(DF_test)[1:4])
step(lm(Sum~1, data=DF_test), direction="forward", scope=~ y2013 + y2014 + y2015 + y2016)
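One way to see the difference between the bare and the backticked form is to ask R which variables it finds in each formula. The sketch below uses all.vars() for that, and also shows one possible way to build a backticked scope formula from a vector of column names. The names year_cols and scope_formula are made up for illustration.

```r
# Numeric literals contribute no variable names to a formula
all.vars(~ 2013 + 2014)
# character(0)

# Backticked names are treated as symbols, i.e. as column names
all.vars(~ `2013` + `2014`)
# "2013" "2014"

# Build a one-sided formula from column names, wrapping each in backticks
# so that names starting with a digit survive parsing
year_cols <- c("2013", "2014", "2015", "2016")
scope_formula <- as.formula(
  paste("~", paste(sprintf("`%s`", year_cols), collapse = " + "))
)
all.vars(scope_formula)
```

A formula built this way could then be passed as scope= in place of the hand-written backticked version.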
Upvotes: 1
Reputation: 1784
I think you have to define the scope as an lm() object too.
step(lm(Sum~1,data=DF_test), direction="forward", scope= lm(Sum~.,data=DF_test)) #the "." means all variables
This code runs here but no variables are added. It could be because the example data is too simple.
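For comparison, ?step describes scope as either a single formula or a list with upper and lower components, both formulas. An lm object has neither component, which may be why nothing gets added when one is passed as scope. Below is a minimal sketch of the list form, using simulated data rather than the asker's three rows. The data frame sim and its columns are invented purely for illustration.

```r
# Simulated data for illustration only; `sim` and its columns are made up
set.seed(1)
sim <- data.frame(y2013 = rnorm(50), y2014 = rnorm(50),
                  y2015 = rnorm(50), y2016 = rnorm(50))
sim$Sum <- 2 * sim$y2013 + rnorm(50)

# Start from the intercept-only model
null_fit <- lm(Sum ~ 1, data = sim)

# scope as a list of formulas: lower is the smallest model considered,
# upper is the largest
fit <- step(null_fit,
            direction = "forward",
            scope = list(lower = ~ 1,
                         upper = ~ y2013 + y2014 + y2015 + y2016),
            trace = 0)  # trace = 0 suppresses the per-step printout

# Inspect which terms were kept
formula(fit)
```

With only three observations, as in the original example, a model with two or more predictors fits the data exactly, so the toy data is not a good test of whether the selection itself behaves sensibly.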
Upvotes: 0