Reputation: 71
The PLS regression using sklearn gives very poor prediction results. After fitting the model I cannot find the "intercept" anywhere. Perhaps this affects the model's predictions? The score and loading matrices look fine, and so does the arrangement of the coefficients. In any case, how do I obtain the intercept from the attributes the model already exposes?
This code returns the coefficients of the variables:
from pandas import DataFrame
from sklearn.cross_decomposition import PLSRegression

X = DataFrame({
    'x1': [0.0, 1.0, 2.0, 2.0],
    'x2': [0.0, 0.0, 2.0, 5.0],
    'x3': [1.0, 0.0, 2.0, 4.0],
}, columns=['x1', 'x2', 'x3'])

Y = DataFrame({
    'y': [-0.2, 1.1, 5.9, 12.3],
}, columns=['y'])

def regPLS1(X, Y):
    _COMPS_ = len(X.columns)  # use all latent variables
    model = PLSRegression(_COMPS_).fit(X, Y)
    return model.coef_
The result is:
regPLS1(X,Y)
>>> array([[ 0.84], [ 2.44], [-0.46]])
In addition to these coefficients, the intercept should be 0.26. What am I doing wrong?
EDIT: The correct predicted response Y_hat (exactly matching the observed Y) is:
Y_hat = [-0.2 1.1 5.9 12.3]
Upvotes: 1
Views: 6424
Reputation: 46
To calculate the intercept, use the following:
import numpy
plsModel = PLSRegression(_COMPS_).fit(X, Y)
y_intercept = plsModel.y_mean_ - numpy.dot(plsModel.x_mean_, plsModel.coef_)
I got the formula directly from the R "pls" package:
BInt[1,,i] <- object$Ymeans - object$Xmeans %*% B[,,i]
I tested the results and got the same intercepts from R's 'pls' package and from scikit-learn.
Upvotes: 2
Reputation: 657
Based on my reading of the _PLS implementation, the formula is Y = XB + Err, where model.coef_ is the estimate of B. If you look at the predict method, it appears to use the fitted parameter y_mean_ as the Err term, so I believe that's what you want: for the intercept, look at model.y_mean_ rather than model.coef_. Hope this helps!
Upvotes: 2