Reputation: 1

sklearn.gaussian_process fit() not working with array sizes greater than 100

I am generating an array with random.uniform(low=0.0, high=100.0, size=(150,150)).
I pass this array to a function that generates X, x, and y.

However, whenever the size of the random test matrix exceeds 100, I get the error below.
I have tried changing the theta values, but that did not help.

Has anyone else had this problem? Is this a bug?
I am using Python 2.6 and scikit-learn 0.10. Should I try Python 3?

Any suggestions or comments are welcome.

Thank you.

gp.fit( XKrn, yKrn )
  File "/usr/lib/python2.6/scikit_learn-0.10_git-py2.6-linux-x86_64.egg/sklearn/gaussian_process/gaussian_process.py", line 258, in fit
    raise ValueError("X and y must have the same number of rows.")
ValueError: X and y must have the same number of rows.

Upvotes: 0

Answers (2)

user1978019

Reputation: 3336

My original post was deleted. Thanks, Flexo.

I had the same problem, even though the number of rows I was passing in was the same for my X and y.

In my case, the real problem was that my y contained several output features (columns) to fit against. Gaussian processes here fit a single output feature.

The "number of rows" error was misleading, and stemmed from the fact that I wasn't using the package correctly. To fit multiple output features like this, you'll need a GP for each feature.
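To illustrate the "one GP per feature" idea, here is a rough sketch. The helper fit_per_output and the stand-in MeanModel class are made up for this example and are not part of scikit-learn; the only assumption about the model is that it has a fit(X, y) method that accepts a 1-D y.

```python
import numpy as np


class MeanModel(object):
    """Hypothetical stand-in for an estimator with a fit(X, y) method."""

    def fit(self, X, y):
        # Just remember the mean of the single output column.
        self.mean_ = float(np.mean(y))
        return self


def fit_per_output(make_model, X, Y):
    """Fit one independent model per column of Y (hypothetical helper)."""
    X = np.asarray(X)
    Y = np.asarray(Y)
    if Y.ndim == 1:
        # A single output: treat it as one column.
        Y = Y.reshape(-1, 1)
    if X.shape[0] != Y.shape[0]:
        raise ValueError("X has %d rows but Y has %d"
                         % (X.shape[0], Y.shape[0]))
    models = []
    for j in range(Y.shape[1]):
        model = make_model()      # a fresh model for each output column
        model.fit(X, Y[:, j])     # each fit sees a 1-D target
        models.append(model)
    return models


X = np.random.randn(150, 10)
Y = np.random.randn(150, 3)       # three output features
models = fit_per_output(MeanModel, X, Y)
print(len(models))
```

With scikit-learn installed, passing the GaussianProcess class from the other answer as make_model should work the same way, assuming its fit() takes a 1-D y as in that example.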

Upvotes: 0

ogrisel

Reputation: 40169

The error "ValueError: X and y must have the same number of rows." means that XKrn.shape[0] must be equal to yKrn.shape[0], and in your case it is not. You probably have an error in the code generating the dataset.

Here is a working example:

In [1]: from sklearn.gaussian_process import GaussianProcess

In [2]: import numpy as np

In [3]: X, y = np.random.randn(150, 10), np.random.randn(150)

In [4]: GaussianProcess().fit(X, y)
Out[4]: 
GaussianProcess(beta0=None,
        corr=<function squared_exponential at 0x10d42aaa0>, normalize=True,
        nugget=array(2.220446049250313e-15), optimizer='fmin_cobyla',
        random_start=1,
        random_state=<mtrand.RandomState object at 0x10b4c8360>,
        regr=<function constant at 0x10d42a488>, storage_mode='full',
        theta0=array([[ 0.1]]), thetaL=None, thetaU=None, verbose=False)
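To track down where the shapes diverge, it may help to check them right before calling fit(). The snippet below is only a sketch: check_rows is a made-up helper name, and the arrays are random placeholders standing in for XKrn and yKrn.

```python
import numpy as np


def check_rows(X, y):
    """Hypothetical helper: raise early if X and y disagree on row count."""
    X = np.asarray(X)
    y = np.asarray(y)
    if X.shape[0] != y.shape[0]:
        raise ValueError("X has %d rows but y has %d"
                         % (X.shape[0], y.shape[0]))
    return X.shape[0], y.shape[0]


# Placeholder data standing in for XKrn and yKrn.
X = np.random.uniform(low=0.0, high=100.0, size=(150, 10))
y = np.random.uniform(low=0.0, high=100.0, size=150)

print(X.shape, y.shape)   # inspect the shapes before fitting
check_rows(X, y)          # passes here; raises if the row counts differ
```

If this raises on your real data, the problem is in the function that builds X and y, not in fit().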

Python 3 is not supported yet and the latest released version of scikit-learn is 0.12.1 at this time.

Upvotes: 3
