Reputation: 1634
Fitting a GPR on my data takes a couple of hours, so I want to reuse my pretrained GaussianProcessRegressor.
I think I found a workaround; it seems to produce the same results, but I wonder whether there is a better solution, as this is kind of a hack.
import numpy as np
from scipy.linalg import cholesky, cho_solve
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF, WhiteKernel

kernel = ConstantKernel(0.25, (1e-3, 1e3)) * RBF(hyper_params_rbf, (1e-3, 1e4)) + WhiteKernel(0.0002, (1e-23, 1e3))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=30)

# normalize the data
train = False
if train:
    print('Fitting')
    gp.fit(X, y)
else:
    # restore the fitted state by hand instead of calling fit()
    gp.kernel_ = kernel
    gp.X_train_ = X
    gp.y_train_ = y
    gp._y_train_mean = np.zeros(1)  # unused, as y is not normalized in the regressor
    # precompute quantities required for predictions which are
    # independent of the actual query points
    K = gp.kernel_(gp.X_train_)
    K[np.diag_indices_from(K)] += gp.alpha
    gp.L_ = cholesky(K, lower=True)
    gp.alpha_ = cho_solve((gp.L_, True), gp.y_train_)

y_pred, sigma = gp.predict(x, return_std=True)
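For what it's worth, this is how I convince myself the hack matches a real fit (a minimal sketch, assuming one real fit is affordable for comparison; gp_fitted and gp_restored are hypothetical names). The important detail is that the restored kernel_ must carry the hyperparameters the optimizer actually converged to; reassigning the initial kernel only works if its hyperparameters already equal the optimized ones.

# Hypothetical sanity check: fit once, rebuild a second regressor by hand,
# and verify both give the same predictions.
gp_fitted = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=30)
gp_fitted.fit(X, y)

gp_restored = GaussianProcessRegressor(kernel=gp_fitted.kernel_)
gp_restored.kernel_ = gp_fitted.kernel_  # the *optimized* kernel, not the initial one
gp_restored.X_train_ = X
gp_restored.y_train_ = y
gp_restored._y_train_mean = np.zeros(1)
K = gp_restored.kernel_(X)
K[np.diag_indices_from(K)] += gp_restored.alpha
gp_restored.L_ = cholesky(K, lower=True)
gp_restored.alpha_ = cho_solve((gp_restored.L_, True), y)

# predicted means should agree to numerical precision
np.testing.assert_allclose(gp_fitted.predict(x, return_std=True)[0],
                           gp_restored.predict(x, return_std=True)[0])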
Upvotes: 2
Views: 763
Reputation: 2399
You should serialize your GaussianProcessRegressor model using the pickle or joblib library.
import joblib  # in scikit-learn < 0.23 this was: from sklearn.externals import joblib

if train:
    print('Fitting')
    gp.fit(X, y)
    joblib.dump(gp, 'filename.pkl')
else:
    gp = joblib.load('filename.pkl')
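If you would rather avoid the joblib dependency, the standard-library pickle module works the same way (a minimal sketch; 'filename.pkl' is just a placeholder path):

import pickle

if train:
    gp.fit(X, y)
    with open('filename.pkl', 'wb') as f:
        pickle.dump(gp, f)
else:
    with open('filename.pkl', 'rb') as f:
        gp = pickle.load(f)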
See the scikit-learn documentation on model persistence for details.
Upvotes: 3