Eggywontgrow
Eggywontgrow

Reputation: 11

Invalid power in formula when looping to make lm() models

I'm trying to make different lm models using the same data, but show higher order interactions by changing the exponent. I'm trying to make this in a for loop rather than writing multiple lines. But I'm getting an "invalid power in formula" error.

for (m in 1:5){
  assign(paste("lm.", m, sep = ""), lm(paste("Response ~ (Factor1+Factor2+Factor3+Factor4+Factor5)^", m, sep = "")))
}

Q: What is causing this and how do I fix it?

Secondly, I would like to have the number of factors in the lm function (Factor1+Factor2+...) be pasted in, as the number of Factors changes.

I also have not been able to see if this works yet due to the above error.

I tried:

for (m in 1:n.factors){
  # Make a string to paste in lm()
  assign(paste("lm.paste.", m, sep = ""), paste("Response ~ (", paste(factors, collapse="+"), ")^", m, sep = ""))

  # Paste in string to lm()
  assign(paste("lm.", m, sep = ""), lm(lm.paste.1, data=data.df))
}

Where "factors" is a vector containing strings of my factors ("Factor1", "Factor2" ...). And n.factors is the number of factors.

Q: Is this the correct way of doing this? It feels a little cumbersome.

Maybe there is some way to use

lm(Response ~ ., data = data.df)

And subset the .?

Here is My Data Frame.

Edit:

dput(head(data.df, 20))

structure(list(Factor1 = c(-1, 1, -1, 1, -1, 1, -1, 1, -1, 1, 
-1, 1, -1, 1, -1, 1, -1, 1, -1, 1), Factor2 = c(-1, -1, 1, 1, 
-1, -1, 1, 1, -1, -1, 1, 1, -1, -1, 1, 1, -1, -1, 1, 1), Factor3 = c(-1, 
-1, -1, -1, 1, 1, 1, 1, -1, -1, -1, -1, 1, 1, 1, 1, -1, -1, -1, 
-1), Factor4 = c(-1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 
1, 1, 1, 1, -1, -1, -1, -1), Factor5 = c(-1, -1, -1, -1, -1, 
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1), Response = c(680.45, 
722.48, 702.14, 666.93, 703.67, 642.14, 692.98, 669.26, 491.58, 
475.52, 478.76, 568.23, 444.72, 410.37, 428.51, 491.47, 607.34, 
620.8, 610.55, 638.04), Order = c(17, 30, 14, 8, 32, 20, 26, 
24, 10, 16, 27, 18, 3, 19, 31, 15, 12, 1, 4, 23)), .Names = c("Factor1", 
"Factor2", "Factor3", "Factor4", "Factor5", "Response", "Order"
), row.names = c(NA, 20L), class = "data.frame")

Upvotes: 1

Views: 1067

Answers (1)

Mike Stanley
Mike Stanley

Reputation: 1480

Your problem is simply that 1 isn't an acceptable value for powers in lm:

> lm(mpg ~ cyl^1, mtcars)
Error in terms.formula(formula, data = data) : invalid power in formula

If you change your iterator to use 2:5 then it will run:

> for (m in 2:5) {lm(paste0("mpg ~ cyl^",m), mtcars);print("ok")}
[1] "ok"
[1] "ok"
[1] "ok"
[1] "ok"
> for (m in 1:5) tryCatch({lm(paste0("mpg ~ cyl^",m), mtcars);print("ok")}, error = function(e) warning(e))
[1] "ok"
[1] "ok"
[1] "ok"
[1] "ok"
Warning message:
In terms.formula(formula, data = data) : invalid power in formula

You could write some code that returns an empty string if the value is 1, or "^m" if the value is not 1, and concat that onto the end of your formula if you must have the ^1 version in the loop.

To answer your second question, I would strongly recommend never using assign in this sort of work. Once your lm objects are in the global environment, it's very difficult to get them back to do more things with them. If you wanted to expand your m to 20, or if you wanted to generalise your code for any given m, it would be extremely difficult.

A better option is to store your lm objects in a list. You can do this very simply by changing your for loop into a call to lapply:

> results <- lapply(2:5, function(x) lm(paste0("mpg ~ cyl^",x), mtcars))
> str(results)
List of 4
 $ :List of 12
  ...snip...
 $ :List of 12
  ...snip... 
 $ :List of 12
  ...snip...
 $ :List of 12
  ...snip...

Now you can iterate through your list to do "something" to every model all at once:

> purrr::map(results, function(x) x$coefficients)
[[1]]
(Intercept)         cyl 
   37.88458    -2.87579 

[[2]]
(Intercept)         cyl 
   37.88458    -2.87579 

[[3]]
(Intercept)         cyl 
   37.88458    -2.87579 

[[4]]
(Intercept)         cyl 
   37.88458    -2.87579 

and this is generalisable to any function you want to apply to a list of lm objects, for any m, or indeed for any list of lm objects or even any list of any kind of object, for example:

> purrr::map(results, broom::tidy)
[[1]]
         term estimate std.error statistic      p.value
1 (Intercept) 37.88458 2.0738436 18.267808 8.369155e-18
2         cyl -2.87579 0.3224089 -8.919699 6.112687e-10

[[2]]
         term estimate std.error statistic      p.value
1 (Intercept) 37.88458 2.0738436 18.267808 8.369155e-18
2         cyl -2.87579 0.3224089 -8.919699 6.112687e-10

[[3]]
         term estimate std.error statistic      p.value
1 (Intercept) 37.88458 2.0738436 18.267808 8.369155e-18
2         cyl -2.87579 0.3224089 -8.919699 6.112687e-10

[[4]]
         term estimate std.error statistic      p.value
1 (Intercept) 37.88458 2.0738436 18.267808 8.369155e-18
2         cyl -2.87579 0.3224089 -8.919699 6.112687e-10

Now you have a list of dataframes, where each data frame contains some interesting information about the model. You could use purrr::map to, say, check the p-values in each data frame in turn and do things with them, or create 4 plots of the fitted regression line, or do whatever else you fancy.

Upvotes: 4

Related Questions