Reputation: 692
Code:
from sklearn.linear_model import LogisticRegression
l = LogisticRegression()
b = l.fit(XT, Y)
print("coeff", b.coef_)
print("intercept", b.intercept_)
Here's the dataset:
XT = [[23], [24], [26], [21], [29], [31], [27], [24], [22], [23]]
Y = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]
Result:
coeff [[ 0.00850441]]
intercept [-0.15184511]
Now I added the same data in SPSS (Analyze -> Regression -> Binary Logistic Regression). I set Y as the dependent variable and XT as a covariate. The results weren't even close. Am I missing something in Python or SPSS?
Python-Sklearn
Upvotes: 2
Views: 1299
Reputation: 11
With sklearn you can also "turn off" the regularization by setting the penalty to None. Then no regularization is applied, and sklearn's logistic regression gives results similar to SPSS. An example of a logistic regression from sklearn with 1000 iterations and no penalty is:
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression(max_iter=1000, penalty=None)  # recent sklearn; older versions used penalty='none'
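For completeness, a runnable sketch of that fit on the question's data (the exact printed values depend on the solver and sklearn version, so I haven't pinned them):
from sklearn.linear_model import LogisticRegression

# The question's data: one covariate (XT) and a binary outcome (Y)
XT = [[23], [24], [26], [21], [29], [31], [27], [24], [22], [23]]
Y = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# penalty=None disables regularization, so this is plain
# maximum-likelihood logistic regression, as SPSS computes it
lr = LogisticRegression(max_iter=1000, penalty=None)
lr.fit(XT, Y)
print("coeff", lr.coef_)
print("intercept", lr.intercept_)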
Upvotes: 1
Reputation: 692
Solved it myself. I tried changing the C value in LogisticRegression(C=100). That did the trick. C=1000 got the result closest to the SPSS and textbook results. Hope this helps anyone who runs into problems with LogisticRegression in Python.
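In code, a sketch of that fix (reusing XT and Y from the question above):
from sklearn.linear_model import LogisticRegression

# Large C means weak regularization; C=1000 was enough here
b = LogisticRegression(C=1000).fit(XT, Y)
print("coeff", b.coef_)
print("intercept", b.intercept_)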
Upvotes: 3
Reputation: 7219
SPSS logistic regression does not include parameter regularisation in its cost function; it just does 'raw' logistic regression. In regularisation, the cost function includes a penalty term to prevent overfitting. In sklearn you specify the inverse of the regularisation strength with the C value, so if you set C to a very high value, sklearn will closely mimic SPSS. There is no magic number: just set C as high as you can, and there will be effectively no regularisation.
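A quick way to see this on the question's data (a sketch; as C grows, the estimates converge to the unregularised maximum-likelihood solution):
from sklearn.linear_model import LogisticRegression

XT = [[23], [24], [26], [21], [29], [31], [27], [24], [22], [23]]
Y = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# C is the inverse regularisation strength: larger C, weaker penalty
for C in [1, 100, 10000, 1e8]:
    lr = LogisticRegression(C=C, max_iter=1000).fit(XT, Y)
    print(f"C={C:g}: coeff={lr.coef_[0][0]:.5f}, intercept={lr.intercept_[0]:.5f}")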
Upvotes: 2