Reputation: 7832
I want to extract information from model formulas, especially I want to remove random effects to just get the "fixed effects part" from mixed models (lme4-notation).
To do this, I search for the last +
in the formula before a paranthesis (
is found. Everything until the +
must be the "fixed" part of the formula. This works fine for models with fixed effects predictors / variables.
However, for null-models (intercept-only in the fixed effects), there might be no +
, e.g. if the formula is Reaction ~ (Days | Subject)
. In this case, I check if there is no +
-sign. But this does not work for models with multiple random parts. In the below examples, the grepl()
for f2
should return FALSE
, but returns TRUE
, because the +
is found for the second opening paranthesis in the random parts.
My question: How can I stop checking for +
after the first (
, so that a possible second or third random effect term is ignored? The goal is for the below example that the grepl()
-commands return FALSE
, FALSE
, TRUE
, TRUE
.
f1 <- "Reaction ~ (1 + Days | Subject)"
f2 <- "Reaction ~ (1 | mygrp/mysubgrp) + (1 | Subject)"
f3 <- "Reaction ~ x1 + x2 + (1 + Days | Subject)"
f4 <- "Reaction ~ x1 + x2 + (1 | mygrp/mysubgrp) + (1 | Subject)"
# works!
grepl("\\+(\\s)*\\((.*)\\)", f1) # should return FALSE
#> [1] FALSE
# fails...
grepl("\\+(\\s)*\\((.*)\\)", f2) # should return FALSE
#> [1] TRUE
# works!
grepl("\\+(\\s)*\\((.*)\\)", f3) # should return TRUE
#> [1] TRUE
# works!
grepl("\\+(\\s)*\\((.*)\\)", f4) # should return TRUE
#> [1] TRUE
Upvotes: 2
Views: 368
Reputation: 8846
Oliver's answer is correct, especially as you're already using lme4
, but there is also a base
framework for modifying formulae that can be used.
# Is read as class formula
f4 <- Reaction ~ x1 + x2 + (1 | mygrp/mysubgrp) + (1 | Subject)
# Isolate the terms and find which contains a vertical bar
f4t <- terms(f4)
dr <- grep("|", labels(f4t), fixed=TRUE)
# Drop the term(s) containing a vertical bar
f4td <- drop.terms(f4t, dr)
# Update the old formula with the new set of terms
f4u <- update(f4, f4td)
# Voilà
f4u
# Reaction ~ x1 + x2
As mentioned in the comments, this will fail in two particular cases: all effects are random, and no effects are random. To handle these exceptions properly I found it best to write a proper function while I'm at it.
drop_randfx <- function(form) {
form.t <- terms(form)
dr <- grepl("|", labels(form.t), fixed=TRUE)
if (mean(dr) == 1) {
form.u <- update(form, . ~ 1)
} else {
if (mean(dr) == 0) {
form.u <- form
} else {
form.td <- drop.terms(form.t, which(dr))
form.u <- update(form, form.td)
}
}
form.u
}
This passes all the tests
f1 <- Reaction ~ (1 + Days | Subject)
f2 <- Reaction ~ (1 | mygrp/mysubgrp) + (1 | Subject)
f3 <- Reaction ~ x1 + x2 + (1 + Days | Subject)
f4 <- Reaction ~ x1 * x2 + (1 | mygrp/mysubgrp) + (1 | Subject)
f5 <- Reaction ~ x1 + x2
sapply(list(f1, f2, f3, f4, f5), drop_randfx) # [[1]]
# [[1]]
# Reaction ~ 1
#
# [[2]]
# Reaction ~ 1
#
# [[3]]
# Reaction ~ x1 + x2
#
# [[4]]
# Reaction ~ x1 + x2 + x1:x2
#
# [[5]]
# Reaction ~ x1 + x2
Upvotes: 2
Reputation: 8572
This is not really answering your question from a RE perspective (for which there likely is an answer), but if your goal is to extract the random effects and/or fixed effect formula's you might gain more from looking at the source code of glFormula
and lFormula
form the lme4
package itself. As they are creating both the design matrix X
and Z
for fixed and random effects respectively, they will have to extract their individual parts at some points.
For example, to extract the fixed effects the function nobars
and RHSForm
are used:
library(lme4)
f1 <- Reaction ~ (1 + Days | Subject)
f2 <- Reaction ~ (1 | mygrp/mysubgrp) + (1 | Subject)
f3 <- Reaction ~ x1 + x2 + (1 + Days | Subject)
f4 <- Reaction ~ x1 + x2 + (1 | mygrp/mysubgrp) + (1 | Subject)
(f1FixedEffects <- nobars(lme4:::RHSForm(f1)) #note the triple 'lme4:::'. RHSForm is not exported to the public environment.
[1] 1
(f2FixedEffects <- nobars(lme4:::RHSForm(f2))
[1] 1
(f1FixedEffects <- nobars(lme4:::RHSForm(f3))
x1 + x2
(f1FixedEffects <- nobars(lme4:::RHSForm(f4))
x1 + x2
If the desire is to extract the entire formula you can use
lme4:::RHSForm(f1) <- nobars(lme4:::RHSForm(f1)
f1
Reaction ~ 1
or similar (thanks to AkselA for his comment)
nobars(f1)
Reaction ~ 1
for the fixed effects.
Note that i converted your string formulas to formulas. This could also just be done with 'as.formula()'
Upvotes: 2