eykanal

Reputation: 27047

Difference between ":" and "|" in R linear modeling

When constructing a linear model in R, what is the difference between the following two statements:

lm(y ~ x | z)
lm(y ~ x : z)

The lm documentation describes the : operator as follows:

A specification of the form first:second indicates the set of terms obtained by taking the interactions of all terms in first with all terms in second.

There's no mention of | syntax on that page. What is the difference?

Upvotes: 8

Views: 4563

Answers (1)

Richie Cotton

Reputation: 121127

: is used for interactions. In your example lm(y ~ x : z), the formula means "y is dependent upon an interaction effect between x and z."

Usually, you wouldn't include an interaction in a linear regression like this unless you also included the individual terms x and z. x * z is short for x + z + x:z, i.e. both main effects plus their interaction.
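To make that concrete, here is a minimal sketch on simulated data. The data frame d and the coefficients used to generate y are made up purely for illustration. It fits the interaction-only model and the full model, then checks that x * z and x + z + x:z give the same fit.

```r
# Simulated data; names and coefficients are illustrative assumptions only.
set.seed(1)
d <- data.frame(x = rnorm(50), z = rnorm(50))
d$y <- 1 + 2 * d$x - d$z + 0.5 * d$x * d$z + rnorm(50)

# Interaction only: intercept plus the x:z term, no main effects.
fit_int <- lm(y ~ x:z, data = d)
print(names(coef(fit_int)))

# Main effects plus interaction, written two equivalent ways.
fit_star <- lm(y ~ x * z, data = d)
fit_long <- lm(y ~ x + z + x:z, data = d)
print(names(coef(fit_star)))

# The two spellings should produce the same coefficients.
same_fit <- isTRUE(all.equal(coef(fit_star), coef(fit_long)))
print(same_fit)
```

The interaction-only fit has just two coefficients, which is why it is rarely what you want.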

AFAIK, | isn't used by lm at all. It certainly doesn't show up in any of the examples in demo("lm.glm", "stats"). It is used in the mixed effects models in the nlme package.
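If you want to check how a formula is parsed, you can inspect its term labels without fitting anything. This is only a sketch. My understanding is that the stats formula machinery does not treat | as a special operator, so it is kept as a single ordinary term (the elementwise logical OR of x and z) rather than expanded like : or *.

```r
# terms() parses a formula symbolically, so no data is needed here.
labels_colon <- attr(terms(y ~ x:z), "term.labels")
labels_star  <- attr(terms(y ~ x * z), "term.labels")
labels_bar   <- attr(terms(y ~ x | z), "term.labels")

print(labels_colon)  # the single interaction term
print(labels_star)   # main effects plus interaction
print(labels_bar)    # | is not expanded into separate model terms
```

So lm(y ~ x | z) may run without complaint, but it would not be fitting a grouped or conditional model.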

An example from ?intervals.lme:

library(nlme)  # provides lme() and the Orthodont data set
model <- lme(distance ~ age, data = Orthodont, random = ~ age | Subject)
ranef(model)

Here the | means "group by". That is, the random effects on its left (an intercept and a slope for age) are fitted separately for every subject. Looking at ranef(model), you can see that each row holds the random effects for one person (subject).
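A rough way to confirm the "group by" reading is to compare the shape of the random effects table with the number of subjects. This sketch assumes the nlme package is installed; the variable names re and n_subjects are mine.

```r
library(nlme)

# Same model as above: random intercept and age slope within each Subject.
model <- lme(distance ~ age, data = Orthodont, random = ~ age | Subject)

re <- ranef(model)                       # one row per grouping level
n_subjects <- nlevels(Orthodont$Subject)

print(dim(re))        # rows = subjects, columns = random effects
print(colnames(re))   # which effects vary by subject
print(n_subjects)
```

The row count matching the number of subjects is what you would expect if | defines the grouping factor.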

Upvotes: 15
