Reputation: 133
I'm trying to take a transformation of the dose predictor, here is my code:
mod = glm(colonies ~ (as.numeric(as.factor(dose)))^(m), data = salmonella, family = "poisson")
where "m" is the power I use. However, I got an error
> mod = glm(colonies ~ (as.numeric(as.factor(dose)))^(m), data = salmonella, family = "poisson")
Error in terms.formula(formula, data = data) : invalid power in formula
Any one knows why?
Sorry for not being clear. Here my m is -0.18182 from an earlier calculation. I now understand I shouldn't use as.numeric(as.factor). But if the code is
mod = glm(colonies ~ (as.factor(dose))^(m), data = salmonella, family = "poisson")
The error is still here. It's weird because when I change m to 2, it works.
Upvotes: 3
Views: 5171
Reputation: 263441
If this is the salmonella
dataset from package:'faraway' then you not use either as.factor or as.numeric on the dose value, since it's already numeric.
Furthermore, the proper way to do polynomial models in R is to use the poly
function rather than forming quadratic terms. If you insist on using "raw" quadratic terms, then it would be easier to do with poly but as Ben has suggested it should be with the I function
library(faraday)
m=2
mod = glm(colonies ~ I(dose^m), data = salmonella, family = "poisson")
Better, however, would be:
m=2; mod = glm(colonies ~ poly(dose, m), data = salmonella, family = "poisson")
Which will give you both the linear and the quadratic terms but the quadratic term will be done as an orthogonal polynomial which then lets you make proper inferences.
Upvotes: 1
Reputation: 226577
tl;dr my best guess is that you should use I(...^m)
to protect ^
/have R treat it as a numerical exponentiation operator.
I found salmonella
in the faraway
package, and can confirm your error. Indeed, it persists through a variety of simplifications.
m <- 1 ## same results with m <- 2L, etc.
mod = glm(colonies ~ (as.numeric(as.factor(dose)))^(m), data = salmonella, family = "poisson")
mod = glm(colonies ~ dose^(m), data = salmonella, family = "poisson")
mod = glm(colonies ~ dose^m, data = salmonella, family = "poisson")
mod = lm(colonies ~ dose^m, data = salmonella)
It looks like R's formula interface does not allow symbolic substitution in a power in a formula.
However: if what you are really trying to do is
dose
to an evenly spaced integer value (0=1, 10=2, 33=3)then using I()
to specify that R should treat ^
as a numeric operator, not an interaction operator in a formula, is what you want:
ss <- transform(salmonella, numdose=as.numeric(as.factor(dose)))
mod = glm(colonies ~ I(numdose^m), data = ss, family = "poisson")
OTOH the picture shows that this isn't completely crazy (although also unnecessary):
library(ggplot2); theme_set(theme_bw())
m <- 2
ggplot(ss,aes(numdose,colonies))+
geom_point()+
geom_smooth(method="glm",method.args=list(family=poisson))+
geom_smooth(method="glm",method.args=list(family=poisson),
formula=y~I(x^m),colour="red")
ggsave("numdose.png")
Upvotes: 3