Reputation: 63
I am quite new to Python. I would like to get a summary of a logistic regression like the one R provides. I have created the variables x_train and y_train, and I am trying to fit a logistic regression:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model
clf = linear_model.LogisticRegression(C=1e5)
clf.fit(x_train, y_train)
What I get is:
LogisticRegression(C=100000.0, class_weight=None, dual=False,
fit_intercept=True, intercept_scaling=1, max_iter=100,
multi_class='ovr', n_jobs=1, penalty='l2', random_state=None,
solver='liblinear', tol=0.0001, verbose=0, warm_start=False)
I would like to have a summary with significance levels, R², etc.
Upvotes: 3
Views: 30793
Reputation: 78
import statsmodels.api as sm
x_train1 = sm.add_constant(x_train1)
lm_1 = sm.OLS(y_train, x_train1).fit()
lm_1.summary()
This is a very useful package for those who are used to R's model summary.
For more info, refer to the articles below:
Upvotes: 1
Reputation: 4214
I'd recommend taking a look at the statsmodels
library. Sk-learn is great (and the other answers provide ways to get at R2 and other metrics), but statsmodels
provides a regression summary very similar to the one you're probably used to in R.
As an example:
import statsmodels.api as sm
from sklearn.datasets import make_blobs
x, y = make_blobs(n_samples=50, n_features=2, cluster_std=5.0,
centers=[(0,0), (2,2)], shuffle=False, random_state=12)
logit_model = sm.Logit(y, sm.add_constant(x)).fit()
print(logit_model.summary())
Optimization terminated successfully.
Current function value: 0.620237
Iterations 5
Logit Regression Results
==============================================================================
Dep. Variable: y No. Observations: 50
Model: Logit Df Residuals: 47
Method: MLE Df Model: 2
Date: Wed, 28 Dec 2016 Pseudo R-squ.: 0.1052
Time: 12:58:10 Log-Likelihood: -31.012
converged: True LL-Null: -34.657
LLR p-value: 0.02611
==============================================================================
coef std err z P>|z| [95.0% Conf. Int.]
------------------------------------------------------------------------------
const -0.0813 0.308 -0.264 0.792 -0.684 0.522
x1 0.1230 0.065 1.888 0.059 -0.005 0.251
x2 0.1104 0.060 1.827 0.068 -0.008 0.229
==============================================================================
If you want to add regularization, call .fit_regularized()
instead of .fit()
after the Logit initialization, and pass in an alpha parameter (regularization strength). If you do this, remember that the C
parameter in sk-learn is actually the inverse of regularization strength.
Upvotes: 7
Reputation: 2096
For obtaining significance levels you can use sklearn.feature_selection.f_regression.
For obtaining R² you can use sklearn.metrics.r2_score.
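Putting both together on a small synthetic dataset (the data and variable names here are purely illustrative):

```python
import numpy as np
from sklearn.feature_selection import f_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.RandomState(0)
X = rng.rand(100, 2)
y = 3 * X[:, 0] + rng.rand(100)

# Per-feature F statistics and p-values (significance levels)
F, p_values = f_regression(X, y)

# R^2 of a fitted model's predictions
model = LinearRegression().fit(X, y)
r2 = r2_score(y, model.predict(X))
```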
Upvotes: 1
Reputation: 31659
You can call clf.score(test_samples, true_values)
to get a score, although note that for a classifier such as LogisticRegression this returns mean accuracy rather than R².
Significance is not directly provided by sklearn, but have a look at the answer here and this code.
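A minimal sketch of what score reports for a classifier (the dataset here is synthetic):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, random_state=0)
clf = LogisticRegression().fit(X, y)

# For classifiers, .score reports mean accuracy on the given data,
# a value between 0 and 1 -- not R^2.
acc = clf.score(X, y)
```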
Upvotes: 0