Reputation: 12128
I'm trying to access the values of skew and kurtosis from an OLS regression fitted with statsmodels.formula.api.ols.
Here's my work on a sample dataframe:
# First initialize the df
import pandas as pd
import numpy as np
np.random.seed(11)
df = pd.DataFrame({'Group':np.random.randint(1,4,100),'Score_1':np.random.randint(1,100,100),'Score_2':np.random.randint(1,200,100)})
df['Score_1'] = df['Score_1']*df['Group'] * np.random.random_sample(100)
df['Score_2'] = df['Score_1']*df['Score_2']
# -----------------------------------
# Next, apply ols regression loopwise:
from statsmodels.formula.api import ols
records = []
for col in ['Score_1','Score_2']:
    mod = ols(f'{col} ~ C(Group)',data=df).fit()
    # If we only care about significant differences
    # if (mod.f_pvalue<=0.05):
    i = 0
    for gen in sorted(df['Group'].unique()):
        rec = {'variable':col,
               'f_pvalue': mod.f_pvalue,
               'group': gen,
               'mean':mod.params[i],
               'conf int lower':mod.conf_int().values[i][0],
               'conf int upper':mod.conf_int().values[i][1],
               'p value': mod.pvalues[i],
               'Log-Likelihood':mod.llf,
               # **I'm trying to access the value for the item below:**
               # 'Skew':mod.diagn['skew'],
               }
        records.append(rec)
        i+=1
As demonstrated in the code above, I'm having trouble accessing these specific items from the model.
Upvotes: 1
Views: 1117
Reputation: 497
I am assuming you are looking for the skewness of your residuals. Your mod is a RegressionResults object and it has no diagn attribute (see docs). Instead, you can use the skew function from scipy:
from scipy.stats import skew
records = []
for col in ['Score_1','Score_2']:
    mod = ols(f'{col} ~ C(Group)', data=df).fit()
    i = 0
    for gen in sorted(df['Group'].unique()):
        rec = {
            'variable':col,
            'f_pvalue': mod.f_pvalue,
            'group': gen,
            # params and pvalues are label-indexed Series, so use .iloc for positional access
            'mean':mod.params.iloc[i],
            'conf int lower':mod.conf_int().values[i][0],
            'conf int upper':mod.conf_int().values[i][1],
            'p value': mod.pvalues.iloc[i],
            'Log-Likelihood':mod.llf,
            # skewness of the model's (Pearson) residuals
            'Skew': skew(mod.resid_pearson),
        }
        records.append(rec)
        i+=1
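The question also asks about kurtosis, which the loop above does not cover. Here is a minimal sketch, assuming a plain OLS fit and that the value wanted is the kurtosis of the residuals. Note that scipy.stats.kurtosis returns excess kurtosis by default, so fisher=False is needed to get the Pearson definition (a normal distribution scores 3). statsmodels also has a jarque_bera helper in statsmodels.stats.stattools that returns skew and kurtosis alongside the test statistic. The demo frame and the variable names below are made up for illustration and are not part of the question's data.

```python
import numpy as np
import pandas as pd
from scipy.stats import kurtosis, skew
from statsmodels.formula.api import ols
from statsmodels.stats.stattools import jarque_bera

# Small synthetic frame, made up for illustration only
rng = np.random.default_rng(11)
demo = pd.DataFrame({
    "Group": rng.integers(1, 4, 100),
    "Score_1": rng.normal(50, 10, 100),
})

fit = ols("Score_1 ~ C(Group)", data=demo).fit()
resid = fit.resid.to_numpy()

# scipy: fisher=False switches from excess kurtosis to the Pearson definition
resid_skew = skew(resid)
resid_kurt = kurtosis(resid, fisher=False)

# statsmodels: returns (JB statistic, JB p-value, skew, kurtosis)
jb_stat, jb_pvalue, jb_skew, jb_kurt = jarque_bera(resid)

print(resid_skew, resid_kurt)
print(jb_skew, jb_kurt)
```

Both routes should give the same two numbers, so either one can feed extra entries in rec, for example 'Kurtosis': kurtosis(mod.resid, fisher=False).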
Upvotes: 3