Reputation: 715
Building off this question: Q
Let's say I have a dataframe like this:
import pandas as pd
d = {'y': [1.2, 2.41, 3.12, 4.76], 'x': ['A', 'B', 'A', 'B'], 'r1': ['a', 'b', 'c', 'd'], 'r2': ['a2', 'b2', 'c2', 'd2']}
df = pd.DataFrame(d)
y is a continuous variable. x is a binary categorical variable and is the fixed effect. r1 and r2 are categorical and are the random effects.
I pass it to the mixed model like this:
import statsmodels.formula.api as smf
md = smf.mixedlm("y ~ x", df, groups=df["r1"], re_formula="~ r1")
This works fine.
But now I want to add a second random variable, and groups only accepts a 1D array. I don't know how to rearrange the data so that I can pass two variables to groups as a 1D array.
In summary: how do I rearrange the dataframe so that I can pass two variables to groups as a 1D array? Please show the syntax for this.
Upvotes: 1
Views: 1098
Reputation: 33172
So you need a crossed random effects model.
From the documentation:
Statsmodels MixedLM handles most non-crossed random effects models, and some crossed models. To include crossed random effects in a model, it is necessary to treat the entire dataset as a single group. The variance components arguments to the model can then be used to define models with various combinations of crossed and non-crossed random effects.
Since you need a crossed model with no independent groups, you need to put everyone in the same group and specify the random effects using variance components.
import pandas as pd
import statsmodels.api as sm
d = {'y':[1,2,3,4],'x':[1,2,3,4],'r1':[1,2,3,4],'r2':[1,2,3,4]}
df = pd.DataFrame(d)
df["group"] = 1  # put all rows in the same group
vcf = {"r1": "0 + C(r1)", "r2": "0 + C(r2)"}  # one variance component formula per crossed factor
model = sm.MixedLM.from_formula("y ~ x", groups="group",
                                vc_formula=vcf, re_formula="~r1", data=df)
result = model.fit()
Upvotes: 5