J.A.Cado

Reputation: 715

From 2D to 1D: how to pass a second random effect in a mixed model [Python, statsmodels]

Building on this question: Q

Let's say I have a dataframe like this:

import pandas as pd
d = {'y':[1.2,2.41,3.12,4.76],'x':['A','B','A','B'],'r1':['a','b','c','d'],'r2':['a2','b2','c2','d2']}
df = pd.DataFrame(d)

y is a continuous variable. x is a binary categorical variable and is the fixed component. r1 and r2 are categorical and are the random components.

I would pass it to the mixed model like this:

import statsmodels.formula.api as smf
md = smf.mixedlm("y ~ x", df, groups=df["r1"], re_formula="~ r1")

This works fine.

But now I want to add a second random variable. The groups argument only accepts a 1D array, and I don't know how to rearrange the data so that I can pass two variables to groups as a 1D array.

In summary: how can I rearrange the dataframe so that I can pass two variables to groups as a 1D array? Please show the syntax for this.

Upvotes: 1

Views: 1098

Answers (1)

seralouk

Reputation: 33172

So you need a crossed random effects model.

From the documentation:

Statsmodels MixedLM handles most non-crossed random effects models, and some crossed models. To include crossed random effects in a model, it is necessary to treat the entire dataset as a single group. The variance components arguments to the model can then be used to define models with various combinations of crossed and non-crossed random effects.


Since you need a crossed model with no independent groups, you need to put everyone in the same group and specify the random effects using variance components.

import pandas as pd
import statsmodels.api as sm

d = {'y':[1,2,3,4],'x':[1,2,3,4],'r1':[1,2,3,4],'r2':[1,2,3,4]}
df = pd.DataFrame(d)
df["group"] = 1    # put every row in the same group

# one variance component per crossed factor
vcf = {"r1": "0 + C(r1)", "r2": "0 + C(r2)"}
model = sm.MixedLM.from_formula("y ~ x", groups="group",
                                vc_formula=vcf, re_formula="~r1", data=df)
result = model.fit()

Upvotes: 5
