learner

Reputation: 2742

as.formula in R doesn't seem to accept a name that starts with a number followed by _

How can I avoid the following error? as.formula() does not seem to accept a variable name that starts with a number followed by _ (underscore). I am generating these variable names dynamically, and I am at a stage where I do not want to go back and change them. Thanks.

lhsOfFormula = "25_dep"
rhsOfFormula  = "predVar1+predVar2+10_predVar3"
as.formula(paste(lhsOfFormula , " ~ ", rhsOfFormula ))

ERROR:

> as.formula(paste(lhsOfFormula , " ~ ", rhsOfFormula ))
Error in parse(text = x) : <text>:1:3: unexpected input
1: 25_
     ^

Upvotes: 4

Views: 6216

Answers (1)

Aaron - mostly inactive

Reputation: 37784

A name that starts with a digit is not syntactically valid in R, so the parser stops at it. You need to wrap such names in backticks, something like this:

> lhsOfFormula <- "25_dep"
> rhsOfFormula <- c("predVar1", "predVar2", "10_predVar3")
> addq <- function(x) paste0("`", x, "`")
> as.formula(paste(addq(lhsOfFormula) , " ~ ", paste(addq(rhsOfFormula),collapse=" + " )))
`25_dep` ~ predVar1 + predVar2 + `10_predVar3`
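
If you would rather quote only the names that actually need it, one option is to compare each name against make.names(), which returns its input unchanged when the name is already valid (for example, make.names("25_dep") gives "X25_dep"). This is only a sketch: quote_if_needed is a made-up helper name, and it assumes the names contain no backticks themselves.

```r
# Hypothetical helper (not part of base R): backtick-quote only the names
# that make.names() would have to alter, i.e. the non-syntactic ones.
quote_if_needed <- function(x) {
  ifelse(x == make.names(x), x, paste0("`", x, "`"))
}

lhsOfFormula <- "25_dep"
rhsOfFormula <- c("predVar1", "predVar2", "10_predVar3")

# Build the formula text from the (selectively) quoted names, then parse it.
f <- as.formula(paste(quote_if_needed(lhsOfFormula), "~",
                      paste(quote_if_needed(rhsOfFormula), collapse = " + ")))
print(f)
```

Quoting every name, as addq does above, is equally valid; the selective version just keeps the formula text closer to what you would type by hand.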

I also vaguely remember there's a function to help with creating formulas, something like formulate, maybe? But I can't find anything about it in my quick search.

EDIT: Thanks to @DWin, it's reformulate. It backticks a non-syntactic response for you, but not the predictors. Here the RHS is changed to have a valid name so that the code works:

> lhsOfFormula = "25_dep"
> rhsOfFormula  = c("predVar1", "predVar2", "x10_predVar3")
> reformulate(rhsOfFormula, lhsOfFormula)
`25_dep` ~ predVar1 + predVar2 + x10_predVar3
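
If renaming the predictor is not an option, reformulate should still be usable as long as the term labels are backtick-quoted before they are passed in. The sketch below also passes the response as a symbol via as.name() instead of a string, since some R versions may warn about a response string that does not parse. quotedTerms is just an illustrative variable name.

```r
lhsOfFormula <- "25_dep"
rhsOfFormula <- c("predVar1", "predVar2", "10_predVar3")

# Backtick-quote each term label so that it parses as a single name.
quotedTerms <- paste0("`", rhsOfFormula, "`")

# Pass the response as a symbol so the string "25_dep" is never parsed.
f <- reformulate(quotedTerms, response = as.name(lhsOfFormula))
print(f)
```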

EDIT: Applying formula directly to a data frame adds the backticks automatically and uses the first column as the response:

> d <- data.frame(`25_dep`=1:5, predvar1=1:5, predvar2=1:5, `10_predvar3`=1:5, 
                  check.names=FALSE)
> formula(d)
`25_dep` ~ predvar1 + predvar2 + `10_predvar3`

The code for that function (stats:::formula.data.frame) can be adapted; it uses as.name like this:

> lhsOfFormula <- "25_dep"
> rhsOfFormula <- c("predVar1", "predVar2", "10_predVar3")
> ns <- sapply(c(lhsOfFormula, rhsOfFormula), as.name)
> formula(paste(ns[1], paste(ns[-1], collapse="+"), sep=" ~ "))
`25_dep` ~ predVar1 + predVar2 + `10_predVar3`
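
The as.name idea can also be taken one step further by skipping the paste-and-parse round trip and assembling the formula as a call. This is an alternative sketch, not what stats:::formula.data.frame itself does; rhsExpr is an illustrative name.

```r
lhsOfFormula <- "25_dep"
rhsOfFormula <- c("predVar1", "predVar2", "10_predVar3")

# Turn every predictor name into a symbol, then chain the symbols with `+`.
rhsExpr <- Reduce(function(a, b) call("+", a, b),
                  lapply(rhsOfFormula, as.name))

# Evaluating a call to `~` yields a formula object; nothing is parsed from text.
f <- eval(call("~", as.name(lhsOfFormula), rhsExpr))
print(f)
```

Because no string is ever parsed, this version does not depend on the names being free of backticks or other special characters.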

Upvotes: 10
