Tim George

Reputation: 11

glmmTMB syntax and model selection

I have daily (longitudinal) price data observed over 5 years for 300 products in 10 stores in 3 US states. Two states have 3 stores each and one state has 4 stores. I am trying to understand what the random and fixed effects are. The predictor variables are a dummy variable that indicates whether or not a particular policy has been enforced in a state, and a dummy variable for certain events/national holidays that occur every year (1 for all the days in a week if there was a national holiday during that week, 0 otherwise).
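To make the event definition concrete, here is a minimal base-R sketch of how such a week-level dummy could be built. The dates and the `holidays` vector are illustrative placeholders, and the sketch assumes weeks run Monday to Sunday, which may not match the definition used in the actual data.

```r
# Hypothetical inputs: one month of daily dates and two illustrative holiday dates.
dates    <- seq(as.Date("2015-01-01"), as.Date("2015-01-31"), by = "day")
holidays <- as.Date(c("2015-01-01", "2015-01-19"))

# Map each date to the Monday that starts its week (assumed week definition).
week_start <- function(d) d - (as.integer(format(d, "%u")) - 1L)

# event = 1 for every day whose week contains at least one holiday, else 0.
event <- as.integer(week_start(dates) %in% week_start(holidays))

head(data.frame(date = dates, event = event), 7)
```

Because the dummy is defined per week, it varies over time within every item and store, which is what a random slope such as `(event | item_id)` would need.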

I want to study the effect of the policy dummy, the event dummy and their interaction on product prices. The random effects are items and stores. I only have 3 states in my data, hence state can't be a random effect, and the policy dummy is collinear with the states, hence I don't think it can be included in the model.
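Whether the policy dummy really is collinear with state can be checked on the fixed-effects design matrix. The sketch below uses toy data, not the real data set, and the column names `state`, `day` and `policy` are assumptions. It compares a policy that is constant within a state with one that switches on partway through the period.

```r
# Toy data: 3 states observed on 4 days each.
toy <- expand.grid(state = factor(c("A", "B", "C")), day = 1:4)

# Case 1 (assumed): the policy is in force in state A for the whole period.
toy$policy_const <- as.integer(toy$state == "A")
X1 <- model.matrix(~ state + policy_const, data = toy)

# Case 2 (assumed): the policy starts in state A on day 3.
toy$policy_tv <- as.integer(toy$state == "A" & toy$day >= 3)
X2 <- model.matrix(~ state + policy_tv, data = toy)

# A rank below the number of columns means the columns are linearly dependent.
rbind(constant     = c(columns = ncol(X1), rank = qr(X1)$rank),
      time_varying = c(columns = ncol(X2), rank = qr(X2)$rank))
```

In the first case the policy column is an exact linear combination of the intercept and the state dummies, so state and policy cannot both be estimated as fixed effects. In the second case the design matrix has full rank.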

Questions:

  1. Can the event dummy be treated as a time variable for which I can include a random slope, such as (event|item_id)?

  2. In glmmTMB syntax in R, do I also need to include the random effects as fixed effects, even if they are not fixed effects? I watched a tutorial that said all random effects (RE) must also be included as fixed effects (FE) in lmer, hence the doubt.

  3. The residuals are non-normal and positively skewed when I fit the model with lmer() in R, hence I use glmmTMB() with a log link and gamma family. The residuals are still non-normal and the Q-Q plot is S-shaped. Does it make sense to use a log-normal family? What other options do I have? I have around 6 million observations (I subset by product category, and each category has around 3 million observations). Is it okay if the residuals are non-normal?

  4. Is the following model selection approach sensible?

4.1.

    library(glmmTMB)

    # Intercept-only baseline with crossed random intercepts for store and item
    null <- glmmTMB(sell_price ~ 1 + (1 | store_id) + (1 | item_id),
                    data = model_food,
                    family = Gamma(link = "log"))

4.2.

    event_as_fixed <- glmmTMB(sell_price ~ 1 + event + (1 | store_id) + (1 | item_id),
                              data = model_food,
                              family = Gamma(link = "log"))

4.3. If the event dummy can be considered a proxy for time:

    event_as_random <- glmmTMB(sell_price ~ 1 + (event | item_id) + (1 | store_id) + (1 | item_id),
                               data = model_food,
                               family = Gamma(link = "log"))

4.4. Depending on 4.2 and 4.3, select the event dummy as a random slope or as a fixed effect, and then add the other fixed effects.

  5. Should I include (store/state), since stores are nested in states? Products are crossed with stores.
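For context on the nesting question: in lme4-style formulas, which glmmTMB also uses, a term written as `(1 | state/store_id)` is shorthand for `(1 | state) + (1 | state:store_id)`. The base-R sketch below uses a toy layout with hypothetical store labels and an assumed `state` column, and checks what the `state:store_id` grouping looks like when every store already has a unique ID.

```r
# Toy layout matching the description: 3 states with 3, 3 and 4 stores.
stores <- data.frame(
  store_id = factor(paste0("S", 1:10)),                      # hypothetical unique IDs
  state    = factor(rep(c("A", "B", "C"), times = c(3, 3, 4)))
)

# Grouping factor that the nested term state:store_id would create.
nested <- interaction(stores$state, stores$store_id, drop = TRUE)

# Compare the number of groups with and without explicit nesting.
c(store_levels = nlevels(stores$store_id), state_by_store_levels = nlevels(nested))
```

When store IDs are unique across states, `state:store_id` defines the same groups as `store_id` alone, so explicit nesting would only add a state-level term on top of `(1 | store_id)`. Explicit nesting changes the store grouping only if the same store label is reused in different states.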

Upvotes: 0

Views: 23

Answers (0)
