Bethanie Stauffer

Reputation: 1

How can I quickly run multiple glm() models for different categories within my data?

I have a df that contains data for the nation, and I would like to see whether the relationship between variables differs at smaller geographies (region, state). I have tried using the subset argument of glm, but this would be very repetitive, as I have many regions and many dependent variables.

For example:

glm(formula = var ~ grade, data = df, subset = region == "north")

But say I have 10 regions and 10 vars, and I want to get the glm results either all together but split by region, or one region after the other, rather than having to run each glm combination separately. Is this possible?

The values for var are all numbers from 1 to 100, and the values for grade are "A", "B", "C", and "D".

my apologies if this doesn't quite make sense.

I tried using subset generally,

glm(formula = var ~ grade, data = df, subset = region)

but this resulted in a contrast error

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels

Upvotes: -1

Views: 38

Answers (1)

Ben Bolker

Reputation: 226742

I think that

lme4::lmList(formula = var ~ grade | region, data = df)

should do what you want. The specific error you're seeing occurs because at least one of your regions has only a single grade value across all of its observations, so grade ends up as a one-level factor in that subset. There are lots of ways to handle this; one is to wrap each fit in try() so that an error for that region doesn't stop your code entirely (though you'd still have to decide what values to record for that region).
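If you'd rather stay with plain glm(), here is a minimal base-R sketch of the try()-style approach, looping over every region and every response. It uses tryCatch() in place of try() so that a failed fit is stored as NULL. The toy data frame, the column names var1 and var2, and the helper fit_one() are all made up for illustration; swap in your own df and response names.

```r
# Hypothetical toy data standing in for the asker's df.
# The "east" region deliberately has only one grade, to trigger the contrasts error.
set.seed(1)
df <- data.frame(
  region = rep(c("north", "south", "east"), each = 20),
  grade  = c(sample(c("A", "B", "C", "D"), 40, replace = TRUE), rep("A", 20)),
  var1   = runif(60, 1, 100),
  var2   = runif(60, 1, 100)
)

# Names of the dependent variables to model (assumed column names).
responses <- c("var1", "var2")

# Fit one glm for one response on one subset; return NULL instead of stopping on error.
fit_one <- function(response, data) {
  tryCatch(
    glm(reformulate("grade", response = response), data = data),
    error = function(e) NULL
  )
}

# One list element per region, each holding one fit (or NULL) per response.
fits <- lapply(split(df, df$region), function(d) {
  setNames(lapply(responses, fit_one, data = d), responses)
})

# Regions where every fit failed, e.g. because grade has a single level there.
failed <- names(fits)[vapply(
  fits,
  function(f) all(vapply(f, is.null, logical(1))),
  logical(1)
)]
```

Individual fits are then available as, for example, summary(fits$north$var1), and failed lists the regions you would need to handle separately.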

Upvotes: 0
