Bits

Reputation: 276

Calculate p-values in sklearn using Python?

I'm new to machine learning and have built a logistic regression model with sklearn, but I can't find any documentation on how to get p-values for my feature variables or for the model as a whole. I have checked the Stack Overflow link but could not get the required output. Please help. Thanks in advance.

Upvotes: 7

Views: 25049

Answers (1)

rnso

Reputation: 24623

One can use the regressors package for this. The following code, which fits an ordinary least squares model, is from: https://regressors.readthedocs.io/en/latest/usage.html

import numpy as np
from sklearn import datasets
# Note: load_boston was removed in scikit-learn 1.2, so this line needs an older version.
boston = datasets.load_boston()
which_betas = np.ones(13, dtype=bool)
which_betas[3] = False  # Eliminate dummy variable
X = boston.data[:, which_betas]
y = boston.target

from sklearn import linear_model
from regressors import stats
ols = linear_model.LinearRegression()
ols.fit(X, y)

# To calculate the p-values of beta coefficients: 
print("coef_pval:\n", stats.coef_pval(ols, X, y))

# To print the summary table:
print("\n=========== SUMMARY ===========")
xlabels = boston.feature_names[which_betas]
stats.summary(ols, X, y, xlabels)

Output:

coef_pval:
 [2.66897615e-13 4.15972994e-04 1.36473287e-05 4.67064962e-01
 1.70032518e-06 0.00000000e+00 7.67610259e-01 1.55431223e-15
 1.51691918e-07 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00]

=========== SUMMARY ===========
Residuals:
     Min      1Q  Median      3Q      Max
-26.3743 -1.9207  0.6648  2.8112  13.3794

Coefficients:
             Estimate  Std. Error  t value   p value
_intercept  36.925033    4.915647   7.5117  0.000000
CRIM        -0.112227    0.031583  -3.5534  0.000416
ZN           0.047025    0.010705   4.3927  0.000014
INDUS        0.040644    0.055844   0.7278  0.467065
NOX        -17.396989    3.591927  -4.8434  0.000002
RM           3.845179    0.272990  14.0854  0.000000
AGE          0.002847    0.009629   0.2957  0.767610
DIS         -1.485557    0.180530  -8.2289  0.000000
RAD          0.327895    0.061569   5.3257  0.000000
TAX         -0.013751    0.001055 -13.0395  0.000000
PTRATIO     -0.991733    0.088994 -11.1438  0.000000
B            0.009827    0.001126   8.7256  0.000000
LSTAT       -0.534914    0.042128 -12.6973  0.000000
---
R-squared:  0.73547,    Adjusted R-squared:  0.72904
F-statistic: 114.23 on 12 features
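
To see how the Estimate, Std. Error, t value and p value columns relate to each other, here is a minimal sketch of the textbook calculation with numpy and scipy. It is not taken from the regressors source, and that package may differ in details such as the degrees of freedom it uses. The helper name `ols_pvalues` and the synthetic data are made up for illustration.

```python
import numpy as np
import scipy.stats as sps


def ols_pvalues(X, y):
    """Textbook two-sided t-test p-values for OLS coefficients.

    Hypothetical helper for illustration; the intercept comes first
    in every returned array.
    """
    n = X.shape[0]
    # Prepend a column of ones so the intercept is estimated with the slopes.
    Xd = np.column_stack([np.ones(n), X])
    p = Xd.shape[1]

    # Least-squares estimate of the coefficients.
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)

    # Residual variance with n - p degrees of freedom.
    resid = y - Xd @ beta
    dof = n - p
    sigma2 = resid @ resid / dof

    # Standard errors are the square roots of the diagonal of
    # sigma^2 * (X'X)^-1.
    cov = sigma2 * np.linalg.inv(Xd.T @ Xd)
    se = np.sqrt(np.diag(cov))

    # t value = estimate / standard error; the p-value is the two-sided
    # tail probability of a t distribution with dof degrees of freedom.
    t = beta / se
    pvals = 2 * sps.t.sf(np.abs(t), dof)
    return beta, se, t, pvals


# Synthetic data (an assumption for the demo): y depends on the first
# column only, the second column is pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 1.0 + 3.0 * X[:, 0] + rng.normal(size=200)

beta, se, t, pvals = ols_pvalues(X, y)
for name, b, s, tv, pv in zip(["_intercept", "x0", "x1"], beta, se, t, pvals):
    print(f"{name:<11}{b:>10.4f}{s:>10.4f}{tv:>10.4f}{pv:>12.6f}")
```

Each printed row has the same four columns as the summary table above. Note that this applies to linear regression only. For a logistic model the standard errors come from the likelihood instead, so this sketch would not carry over unchanged.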

Upvotes: 7
