Reputation: 129
I have the following Python code where I've already try to perform a VCA analysis using REML:
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
data = {'Part':[1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3],
'Employee':[1,1,1,2,2,2,3,3,3,1,1,1,2,2,2,3,3,3,1,1,1,2,2,2,3,3,3],
'Measurement':[103.3, 103.1, 103.1, 103.3, 102.9, 103.6, 103.2, 103.6, 103.1, 104.5, 104.8, 103.9, 104.5, 104, 103.8, 103.8, 103.6, 104, 104, 103.6, 103.5, 103.9, 104.1, 104.5, 104.3, 103.9, 103.8]}
df = pd.DataFrame(data)
lm = ols('Measurement ~ C(Part) + C(Part):(Employee)', data=df).fit()
table = sm.stats.anova_lm(lm, typ=2)
table['Percentage of Total Variance'] = (table['sum_sq'] / table['sum_sq'].sum()) * 100
And then the following result is generated:
When I run this Análises in JMP by running the steps:
And the same result could be achieved in R by the following code:
library('VCA')
library('dplyr')
df <- read.csv('simpler.csv', sep=';')
df$Part <- as.factor(df$Part)
df$Employee <- as.factor(df$Employee)
fitREML <- remlVCA(form=Measurement~Part+Part/Employee, Data = df)
However I cannot get the same result in Python. Is there a way that it could be done in Python without using rpy2?
Detail 1: REML analysis is required for the data that I'm working on.
Detail 2: Nested analysis is required according to R and Python formulas.
Detail 3: I've already tried using mixed models from statsmodels, but the group syntax is not clear.
Upvotes: 6
Views: 463