Ryan

Reputation: 3709

Use scikit-learn to predict data vector "x" given "y"?

Using scikit-learn, the basic idea (with regression, for example) is to predict some "y" given a data vector "x" after having fit a model. Typical code would look like this (adapted from here):

from sklearn.svm import SVR
import numpy as np

n_samples, n_features = 10, 5
np.random.seed(0)
y = np.random.randn(n_samples)
X = np.random.randn(n_samples, n_features)

clf = SVR(C=1.0, epsilon=0.2)
clf.fit(X[:-1], y[:-1])            # fit on all but the last sample
prediction = clf.predict(X[-1:])   # predict expects a 2D array
print('prediction:', prediction[0])
print('actual:', y[-1])

My question is: is it possible to fit some model (perhaps not SVR) given "x" and "y", and then predict "x" given "y"? In other words, something like this:

clf = someCLF()
clf.fit(X[:-1], y[:-1])
prediction = clf.predict(y[-1])
# where predict would return the data vector that could produce y[-1]

Upvotes: 1

Views: 2474

Answers (2)

Ben Allison

Reputation: 7394

Not possible in scikit, no.

You're asking about a generative or joint model of x and y. If you fit such a model, you can do inference about the joint distribution p(x, y), or about either of the conditional distributions p(x | y) and p(y | x). Naive Bayes is the most popular generative model, but you won't be able to do the kind of inference above with scikit's version, and it will produce poor estimates for anything other than trivial problems. Fitting good joint models is much harder than fitting conditional models of one variable given the rest.
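
For illustration only (this is my own sketch, not part of the original answer): if you are willing to assume the joint distribution of (x, y) is Gaussian, you can fit it with plain NumPy and read the conditional p(x | y) off the fitted moments. The toy data and variable names here are invented for the example.

import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 200, 5

# Toy data: y depends linearly on X plus noise, so the joint is roughly Gaussian.
X = rng.standard_normal((n_samples, n_features))
w = rng.standard_normal(n_features)
y = X @ w + 0.1 * rng.standard_normal(n_samples)

# Fit a joint Gaussian over z = (x, y): estimate its mean and covariance.
Z = np.column_stack([X, y])
mu = Z.mean(axis=0)
Sigma = np.cov(Z, rowvar=False)

# Partition the moments: the x block first, y (the last column) second.
mu_x, mu_y = mu[:-1], mu[-1]
S_xy = Sigma[:-1, -1]
S_yy = Sigma[-1, -1]

# Standard Gaussian conditioning: E[x | y] = mu_x + S_xy / S_yy * (y - mu_y)
y_obs = 1.5
x_given_y = mu_x + S_xy / S_yy * (y_obs - mu_y)
print('E[x | y = %.1f]:' % y_obs, x_given_y)

This only recovers the conditional mean of x, which is the point of the answer: a joint model gives you a distribution over x given y, not a unique inverse.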

Upvotes: 1

arsenyinfo

Reputation: 428

No. Many different input vectors X can lead to the same result y (but not vice versa), so the mapping from y back to a unique X is not well defined.

If you need to predict the data you originally used as X, consider swapping the roles of X and y, as in the sketch below.
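
A minimal sketch of that role swap (my own illustration, not from the answer): scikit-learn's LinearRegression accepts multi-output targets, so you can treat y as the single input feature and the original X rows as the targets.

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_samples, n_features = 10, 5
X = rng.standard_normal((n_samples, n_features))
y = rng.standard_normal(n_samples)

# Swap the roles: y becomes the (single) input feature
# and X the multi-output target.
inverse_model = LinearRegression()
inverse_model.fit(y[:-1].reshape(-1, 1), X[:-1])

# "Predict" a data vector for the held-out y value.
# Because many X can map to the same y, this recovers at best
# an average x for that y, not the original vector.
x_hat = inverse_model.predict(np.array([[y[-1]]]))
print('predicted x:', x_hat[0])
print('actual x:   ', X[-1])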

Upvotes: 2
