How do I use GP.fit in scikit-learn for a multi-dimensional input?

Question

Could someone provide an example? I am trying to use it with a 5D input (X1, X2, X3, X4, X5) and a single output dimension, with 1600 data points. My idea is to fit on a training set and then validate the predictions against a testing dataset. I would also like to plot a chart of each input against the output. Right now I only have X1 as my input.

Here is the code:

from matplotlib import pyplot as plt
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.base import BaseEstimator
from sklearn.gaussian_process.kernels import RBF, Matern, WhiteKernel, ConstantKernel, RationalQuadratic, ExpSineSquared, DotProduct
import numpy as np

# Define the kernels to compare
kernels = [1.0 * RBF(length_scale=1.0, length_scale_bounds=(1e-1, 10.0)),
           1.0 * Matern(length_scale=1.0, length_scale_bounds=(1e-1, 10.0),
                        nu=1.5),
           1.0 * RationalQuadratic(length_scale=1.0, alpha=0.1),
           1.0 * ExpSineSquared(length_scale=1.0, periodicity=3.0,
                                length_scale_bounds=(0.1, 10.0),
                                periodicity_bounds=(1.0, 10.0)),
           ConstantKernel(0.1, (0.01, 10.0))
               * (DotProduct(sigma_0=1.0, sigma_0_bounds=(0.0, 10.0)) ** 2),
           ]

# Define inputs and outputs
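x = np.array([-5.2, -3, -2, -1, 1, 5])
X = x.reshape(-1, 1)  # scikit-learn expects shape (n_samples, n_features)
y = np.array([-2, 0, 1, 2, -1, 1])
# Scalar ranges (the built-in max/min on a 2D array return 1-element arrays)
max_x, min_x = x.max(), x.min()
max_y, min_y = y.max(), y.min()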

for fig_index, kernel in enumerate(kernels):
    # call GP regression library and fit inputs to output
    gp = GaussianProcessRegressor(kernel=kernel)
    gp.fit(X, y)
    # Uncomment to inspect the estimator's parameters:
    # print(gp.get_params(deep=True))

    # Kernel with the hyperparameters optimized during fit
    print(gp.kernel_)
    plt.figure(fig_index, figsize=(10,6))
    plt.subplot(2,1,1)
    x_pred = np.array(np.linspace(-5, 5,50), ndmin=2).T

    # Mark the observations
    plt.plot(X, y, 'ro', label='observations')

    X_test = np.linspace(min_x - 1, max_x + 1, 1000).reshape(-1, 1)
    y_mean, y_std = gp.predict(X_test, return_std=True)
    # Draw the mean function and a 95% confidence interval (mean +/- 1.96 std)
    plt.plot(X_test, y_mean, 'b-', label='mean function')
    upper_bound = y_mean + 1.96 * y_std
    lower_bound = y_mean - 1.96 * y_std
    plt.fill_between(X_test.ravel(), lower_bound, upper_bound, color='k', alpha=0.2,
                     label='95% confidence interval')

    # plot posterior
    y_sample = gp.sample_y(X_test,4)
    plt.plot(X_test,y_sample,lw=1)
    plt.scatter(X[:,0],y,c='r',s=50,zorder=10,edgecolor=(0,0,0))
    plt.title("Posterior (kernel: %s)\n Log-Likelihood: %.3f"
              % (gp.kernel_, gp.log_marginal_likelihood(gp.kernel_.theta)),
              fontsize=14)
    plt.tight_layout()
    plt.show()

desertnaut · Accepted Answer

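There is nothing special about using multiple inputs for GP regression: X simply has one column per input, i.e. shape (n_samples, n_features). The only caveat is that, for an anisotropic kernel (a separate length scale per input), you must pass the per-dimension arguments explicitly in the kernel definition.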
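As a minimal sketch of getting five separate input variables into the layout that fit expects, the 1-D arrays can be stacked as columns. The names x1 to x5 and the random values are hypothetical stand-ins for the real measurements; only the 1600-point size comes from the question.

```python
import numpy as np

n = 1600  # number of data points mentioned in the question
rng = np.random.default_rng(0)

# Hypothetical stand-ins for the five measured input variables
x1, x2, x3, x4, x5 = (rng.normal(size=n) for _ in range(5))

# Stack the 1-D arrays as columns: one row per sample, one column per input
X = np.column_stack([x1, x2, x3, x4, x5])
y = rng.normal(size=n)  # placeholder output, one value per sample

print(X.shape)  # (1600, 5)
print(y.shape)  # (1600,)
```

An X and y built this way can be passed to gp.fit(X, y) directly, exactly as in the 1D case.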

Here is a simple example with dummy 5D data, like yours, and an isotropic RBF kernel:

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.datasets import make_regression
import numpy as np

# dummy data:
X, y = make_regression(n_samples=20, n_features=5, n_targets=1)
X.shape
# (20, 5)

kernel = RBF(length_scale=1.0, length_scale_bounds=(1e-1, 10.0))
gp = GaussianProcessRegressor(kernel=kernel)
gp.fit(X, y)
# GaussianProcessRegressor(alpha=1e-10, copy_X_train=True,
#             kernel=RBF(length_scale=1), n_restarts_optimizer=0,
#             normalize_y=False, optimizer='fmin_l_bfgs_b',
#             random_state=None)
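The question also asks about validating against a testing dataset and plotting each input against the output. The following is a sketch of one way to do both, not part of the original answer. The synthetic data, the wider length-scale bounds, alpha=1e-2, normalize_y=True, and the Agg backend are all illustrative assumptions rather than tuned or recommended values.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove for on-screen plots
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_regression
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data: 200 samples, 5 inputs, 1 output
X, y = make_regression(n_samples=200, n_features=5, n_targets=1,
                       noise=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# alpha and normalize_y are illustrative choices, not tuned values
kernel = RBF(length_scale=1.0, length_scale_bounds=(1e-1, 1e3))
gp = GaussianProcessRegressor(kernel=kernel, alpha=1e-2, normalize_y=True)
gp.fit(X_train, y_train)

# Predict on the held-out points and compare with the true outputs
y_mean, y_std = gp.predict(X_test, return_std=True)
r2 = gp.score(X_test, y_test)
print("held-out R^2:", r2)

# One panel per input column: observed vs. predicted output on the test set
fig, axes = plt.subplots(1, 5, figsize=(20, 4), sharey=True)
for i, ax in enumerate(axes):
    ax.scatter(X_test[:, i], y_test, c="r", s=15, label="observed")
    ax.errorbar(X_test[:, i], y_mean, yerr=1.96 * y_std, fmt="b.",
                alpha=0.5, label="predicted (mean, 95% interval)")
    ax.set_xlabel("X%d" % (i + 1))
axes[0].set_ylabel("y")
axes[0].legend()
fig.tight_layout()
```

Each panel shows the output against a single input while the other four inputs also vary from point to point, so the scatter in a panel reflects the influence of the remaining inputs as well as model error.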

UPDATE: In the anisotropic case, you should define one length scale (and, if you set bounds, one pair of bounds) per input dimension explicitly in the kernel; here is an example definition for the RBF kernel and a 2D variable:

kernel = RBF(length_scale=[1.0, 2.0], length_scale_bounds=[(1e-1, 10.0), (1e-2, 1.0)])

Extend analogously for the 5D case.
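A sketch of that 5D extension, assuming a made-up target that depends mainly on the first input. The data, the bounds, and the alpha and normalize_y settings are illustrative assumptions only.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 5))
# Hypothetical target: strong dependence on column 0, weak on column 1,
# none on the remaining columns
y = np.sin(X[:, 0]) + 0.1 * X[:, 1]

# One initial length scale and one (lower, upper) bound pair per input
kernel = RBF(length_scale=[1.0] * 5,
             length_scale_bounds=[(1e-1, 1e3)] * 5)
gp = GaussianProcessRegressor(kernel=kernel, alpha=1e-4, normalize_y=True)
gp.fit(X, y)

# The fitted kernel holds one optimized length scale per input dimension
length_scales = np.asarray(gp.kernel_.length_scale)
print(length_scales)
```

A larger fitted length scale generally means the model treats the output as varying more slowly along that input, so comparing the five values gives a rough indication of which inputs matter most.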
