student_t

Reputation: 243

R shortcut to include all variables with a similar name (i.e. * in Stata)

I was wondering if there was a shortcut or a symbol to include all variables with a similar name.

For instance, if I have a regression with 50 time dummies of the form year1, year2, year3, and so on, in Stata I can include all of them by writing year*.

Is there similar functionality in R? I know I can use something like factor(year) to get the same effect, but for a specific reason I need to have many separate time dummies.

Upvotes: 1

Views: 1139

Answers (2)

Ben Bolker

Reputation: 226577

A variation on @LyzandeR's answer and @G5W's comment, using reformulate():

yearvars <- grep("^year", names(myData), value = TRUE)
form <- reformulate(c("othervar1", "othervar2", yearvars), response = "stuff")

The resulting form is stuff ~ othervar1 + othervar2 + year1 + year2 + ..., which can be passed straight to lm():

lm(form, data=myData)
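
For anyone who wants to run this end to end, here is a minimal sketch on made-up data. The data frame and its column names (including the extra yearly_total column) are assumptions for illustration only. It also tightens the pattern to "^year[0-9]+$", since a plain "^year" would pick up any other column that merely starts with "year".

```r
# Made-up data: a response, two covariates, three year dummies,
# and one unrelated column whose name also starts with "year".
set.seed(1)
myData <- data.frame(
  stuff        = rnorm(20),
  othervar1    = rnorm(20),
  othervar2    = rnorm(20),
  year1        = rep(c(1, 0, 0, 0), 5),
  year2        = rep(c(0, 1, 0, 0), 5),
  year3        = rep(c(0, 0, 1, 0), 5),
  yearly_total = rnorm(20)
)

# Match "year" followed only by digits, so yearly_total is skipped.
yearvars <- grep("^year[0-9]+$", names(myData), value = TRUE)

# Build the formula from the variable names and fit the model.
form <- reformulate(c("othervar1", "othervar2", yearvars), response = "stuff")
fit <- lm(form, data = myData)
```

Printing form or calling all.vars(form) is a quick way to check which columns were actually picked up before fitting.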

Upvotes: 3

LyzandeR

Reputation: 37879

In R you would use a formula to define which of the dummy variables to include. So, to include year1, year2 and year3 in your model, you would build the formula with paste and as.formula:

formula <- as.formula(paste('y ~', paste0('year', 1:3, collapse = ' + ')))
formula
# y ~ year1 + year2 + year3
lm(formula, data = data)

For an lm model you could skip as.formula, because lm converts the string into a formula internally, but some other modelling functions require an actual formula object.
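
A small sketch of that point, on made-up data (the data frame dat and its columns are assumptions for illustration): passing the pasted string directly and passing the as.formula result should give the same fit.

```r
# Made-up data with a response and three year dummies.
set.seed(2)
dat <- data.frame(
  y     = rnorm(12),
  year1 = rep(c(1, 0, 0, 0), 3),
  year2 = rep(c(0, 1, 0, 0), 3),
  year3 = rep(c(0, 0, 1, 0), 3)
)

# Right-hand side as a single string: "year1 + year2 + year3"
rhs <- paste0("year", 1:3, collapse = " + ")

# lm() coerces a character string to a formula internally ...
fit_string <- lm(paste("y ~", rhs), data = dat)

# ... so this explicit conversion yields the same coefficients.
fit_formula <- lm(as.formula(paste("y ~", rhs)), data = dat)

same_fit <- isTRUE(all.equal(coef(fit_string), coef(fit_formula)))
```

Converting explicitly is the safer habit if the same string might later be handed to a function that does not do the coercion for you.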

An alternative is to subset your data.frame so that it contains only the response and the variables you need, and then use y ~ . as the formula.
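
A minimal sketch of this alternative, again on made-up data (dat, id and the column layout are assumptions for illustration): keep only the response and the yearN columns, then let the dot expand to everything that is left.

```r
# Made-up data: response, three year dummies, and an id column
# that should not enter the model.
set.seed(3)
dat <- data.frame(
  id    = 1:12,
  y     = rnorm(12),
  year1 = rep(c(1, 0, 0, 0), 3),
  year2 = rep(c(0, 1, 0, 0), 3),
  year3 = rep(c(0, 0, 1, 0), 3)
)

# Columns to keep: the response plus every column named year<digits>.
keep <- c("y", grep("^year[0-9]+$", names(dat), value = TRUE))

# In the formula, "." stands for all columns of the data other than y.
fit <- lm(y ~ ., data = dat[keep])
```

The dot picks up every remaining column, so the subsetting step is what keeps id out of the model.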

Upvotes: 3
