Tanvi Gurjar

Reputation: 1

How to include only specific factor values in a mixed-model formula?

I'm very new to Stack Overflow and apologize in advance, as this is most likely a very basic question. From the data below, I would like to include only a few of the treatment groups in my model formula.

A brief look at my data:

[screenshot of the data; image not available]

My model formula:

fit <- glmer(Proportion_S ~ TREATMENT + (1|HOUR), control = glmerControl(optimizer = "bobyqa"), data = MotherPExp, family = poisson)

There are a total of 8 levels within "TREATMENT": CCC, CCP, CPC, CPP, PCC, PCP, PPC, PPP. I want to include only specific treatment groups (e.g. only CCC, or CPC, or PPC) as the independent variable, not the entire "TREATMENT" column. I tried specifying the different levels as follows:

data.frame(x1 = c("PCP"), x2 = c("PPP*PCP"), x3 = c("CCP"), x4=("CPP"))

While this worked, I am still not able to incorporate it into the formula:

fit <-glmer(Proportion_S~ x1*x2 + x3 + (1|HOUR) + control=glmerControl(optimizer="bobyqa"), data = MotherPExp, family = poisson)

I get the following error message: Error in data.frame(x1*x2 + x3) : object 'x1' not found

The same error message is generated for x1:x4, even if I reduce the independent variables.

I would really appreciate any inputs! Thanks!

Upvotes: 0

Views: 38

Answers (1)

MrSwaggins

Reputation: 87

I had a different answer up until I realised you were doing this inside the "TREATMENT" variable (column). The PPP*PCP confused me. You can't cross levels within a variable; you can only cross variables within a formula.

You can only use your data column headings inside a glmer() formula.

If you need to run your model on only certain levels, you need to remove the ones you don't want from your data first.

library(dplyr)


MotherPExpNew <- filter(MotherPExp, TREATMENT == "PCP" | TREATMENT == "PPP" | TREATMENT == "CCP" | TREATMENT == "CPP") # only these levels used

The vertical line is an OR symbol. So this is saying: where TREATMENT is equal to this value OR this value, etc., put the row in the new data frame.

Then your model formula will work with the new data frame/tibble, using only those rows of data, i.e. those levels.

fit <-glmer(Proportion_S~TREATMENT + (1|HOUR), control = glmerControl(optimizer="bobyqa"), data = MotherPExpNew, family = poisson)
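The filtering step above can also be written with %in%, which avoids the long chain of ORs. This is only a sketch on made-up toy data (the data frame `toy`, its values, and the `keep` vector are assumptions for illustration, not your real MotherPExp). It uses base R only and additionally calls droplevels(), in case TREATMENT is stored as a factor, so the removed levels do not linger in summaries or plots.

```r
# Toy data standing in for MotherPExp (values are made up for illustration)
toy <- data.frame(
  TREATMENT = factor(c("CCC", "CCP", "CPC", "CPP", "PCC", "PCP", "PPC", "PPP")),
  Proportion_S = c(3, 5, 2, 4, 6, 1, 7, 2),
  HOUR = rep(1:2, times = 4)
)

# The levels we want to keep
keep <- c("PCP", "PPP", "CCP", "CPP")

# Keep only rows whose TREATMENT is one of the wanted levels
toy_new <- toy[toy$TREATMENT %in% keep, ]

# Drop the now-unused factor levels
toy_new$TREATMENT <- droplevels(toy_new$TREATMENT)

# Inspect what is left
print(levels(toy_new$TREATMENT))
print(nrow(toy_new))
```

With dplyr loaded, the equivalent of the subsetting line would be filter(toy, TREATMENT %in% keep).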


If, however, you need to "program" your formulae with variables/columns, then my old answer will help:

For formulas in library(lme4) to work with variables defined outside your data = argument, you need to use as.formula() and paste() to turn the formula "text" into something glmer() can read. Inside paste(), anything that is defined outside of your data = should be outside of the quotes. Note that the strings still have to resolve to column headings in your data. Using what you have supplied (though it is hard to test without a proper set of data being supplied), I would attempt something like this:

fit <- glmer(as.formula(paste("Proportion_S ~", x1, "*", x2, "+", x3, "+ (1|HOUR)")), control = glmerControl(optimizer = "bobyqa"), data = MotherPExp, family = poisson)

EDIT: you don't need these in a data frame:

data.frame(x1 = c("PCP"), x2 = c("PPP*PCP"), x3 = c("CCP"), x4=("CPP"))

This will suffice:

x1 = "PCP"
x2 = "PPP*PCP"
x3 = "CCP"
x4 = "CPP"
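To see what the paste() and as.formula() approach actually builds, here is a minimal sketch that only constructs and inspects the formula, without fitting anything. The column names DOSE and SEX are hypothetical (they are not in your data); the point is that the strings must be names of columns, not levels of a column.

```r
# Hypothetical column names held in variables (DOSE and SEX are assumptions)
x1 <- "TREATMENT"
x2 <- "DOSE"
x3 <- "SEX"

# paste() glues the pieces into one string; as.formula() parses that string
f <- as.formula(paste("Proportion_S ~", x1, "*", x2, "+", x3, "+ (1|HOUR)"))

# Inspect the result: a formula object and the variables it refers to
print(f)
print(class(f))
print(all.vars(f))
```

The resulting object f can then be passed as the first argument to glmer(), provided every name listed by all.vars(f) is a column in the data frame you supply.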

Upvotes: 0
