Reputation: 323
I'm in a situation where I have a vector full of column names for a really large data frame.
Let's assume: x = c("Name", "address", "Gender", ......, "class" )
[approximately 100 variables]
Now, I would like to create a formula which I'll eventually use to create a HoeffdingTree.
I'm creating the formula using:
myformula <- as.formula(paste("class ~ ", paste(x, collapse= "+")))
This throws up the following error:
Error in parse(text = x) : <text>:1:360: unexpected 'else'
1:e+spread+prayforsonni+just+want+amp+argue+blxcknicotine+mood+now+right+actually+herapatra+must+simply+suck+there+always+cookies+ever+everything+getting+nice+nigga+they+times+abu+all+alliepickl
The paste part in the above statement works fine, but passing its result as an argument to as.formula is throwing all kinds of weird problems.
Upvotes: 6
Views: 8867
Reputation: 887068
You may try reformulate
reformulate(setdiff(x, 'class'), response='class')
#class ~ Name + address + Gender
where 'x' is
x <- c("Name", "address", "Gender", 'class')
If 'x' contains R keywords, you can use
reformulate('.', response='class')
#class ~ .
Upvotes: 1
Reputation: 57686
The problem is that you have R keywords as column names. else is a keyword, so you can't use it as a regular name.
A simplified example:
s <- c("x", "else", "z")
f <- paste("y~", paste(s, collapse="+"))
formula(f)
# Error in parse(text = x) : <text>:1:10: unexpected '+'
# 1: y~ x+else+
# ^
The solution is to wrap your words in backticks "`" so that R will treat them as non-syntactic variable names.
f <- paste("y~", paste(sprintf("`%s`", s), collapse="+"))
formula(f)
# y ~ x + `else` + z
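As a quick check on the backtick approach, here is a minimal sketch. The names in s are made up for illustration, including one with a space, which is non-syntactic for the same reason. It quotes every name and then uses all.vars() to confirm that the formula still refers to the original, unquoted column names.

```r
# Hypothetical column names: "else" is a reserved word and "my var"
# contains a space, so neither is a syntactic R name.
s <- c("x", "else", "my var", "z")

# Wrap every name in backticks before pasting the right-hand side together.
rhs <- paste(sprintf("`%s`", s), collapse = " + ")
f <- as.formula(paste("y ~", rhs))

# all.vars() returns the variable names without the backticks,
# so they still match the columns of the data frame.
vars <- all.vars(f)
print(vars)
```

Quoting every name, not only the keywords, keeps the code simple; backticks around an ordinary name such as x are harmless.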
Upvotes: 11
Reputation: 21497
You can reduce your data set first
# union() keeps "class" only once, in case x already contains it
dat_small <- dat[, union("class", x)]
and then use
myformula <- as.formula("class ~ .")
The . means all other columns (all but class).
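A minimal sketch of this approach on toy data follows. The data frame dat, its values, and the column named else are all assumptions made up for illustration. Because the column names are never pasted into a string, a keyword column does not trip up the parser.

```r
# Hypothetical data; the column names and values are invented for this sketch.
dat <- data.frame(
  class = factor(c("a", "b", "a", "b")),
  Name  = c("p", "q", "r", "s"),
  extra = 1:4
)
dat[["else"]] <- c(0.1, 0.2, 0.3, 0.4)  # a column named after an R keyword

x <- c("Name", "else", "class")

# Keep only the response and the wanted predictors; union() avoids
# selecting "class" twice when x already contains it.
dat_small <- dat[, union("class", x)]

myformula <- class ~ .

# terms() expands the dot against the columns of dat_small.
predictors <- attr(terms(myformula, data = dat_small), "term.labels")
print(length(predictors))
```

The modelling function still has to accept a formula containing a dot together with a data argument, which most formula interfaces in R do.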
Upvotes: 2