user1885116

Reputation: 1797

python - linear regression - image

I am trying to wrap my head around machine learning in Python. I have been working with the following example (http://scikit-learn.org/stable/auto_examples/plot_multioutput_face_completion.html#example-plot-multioutput-face-completion-py), using the code below.

I would love to test and validate my understanding of the inner workings of the linear regression. The aim is to predict the missing lower half of a picture from its known upper half. There were originally 300 images of 64*64 (4096) pixels each. The independent variable X is a 300*2048 matrix (300 pictures, 2048 pixels: the upper half of each picture). The dependent variable is also a 300*2048 matrix (the lower half of each picture). It seems that the coefficient matrix is a 2048*2048 matrix. Am I right in my understanding that:

I might very well be confused by the matrices, so please correct me if I am wrong. Many thanks. W

print(__doc__)

import numpy as np
import matplotlib.pyplot as plt

from sklearn.datasets import fetch_olivetti_faces
from sklearn.utils.validation import check_random_state

from sklearn.ensemble import ExtraTreesRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import RidgeCV

# Load the faces datasets
data = fetch_olivetti_faces()
targets = data.target

data = data.images.reshape((len(data.images), -1))
train = data[targets < 30]
test = data[targets >= 30]  # Test on independent people

# Test on a subset of people
n_faces = 5
rng = check_random_state(4)
face_ids = rng.randint(test.shape[0], size=(n_faces, ))
test = test[face_ids, :]

n_pixels = data.shape[1]
# Integer division keeps the slice bounds as ints; np.ceil/np.floor return
# floats, which recent NumPy versions reject as slice indices.
X_train = train[:, :(n_pixels + 1) // 2]  # Upper half of the faces
y_train = train[:, n_pixels // 2:]  # Lower half of the faces
X_test = test[:, :(n_pixels + 1) // 2]
y_test = test[:, n_pixels // 2:]

# Fit estimators
ESTIMATORS = {
    "Extra trees": ExtraTreesRegressor(n_estimators=10, max_features=32,
                                       random_state=0),
    "K-nn": KNeighborsRegressor(),
    "Linear regression": LinearRegression(),
    "Ridge": RidgeCV(),
}

y_test_predict = dict()
for name, estimator in ESTIMATORS.items():
    estimator.fit(X_train, y_train)
    y_test_predict[name] = estimator.predict(X_test)

Upvotes: 3

Views: 6780

Answers (1)

eqzx

Reputation: 5599

You're right.

There are 4096 pixels in each image. Each predicted output pixel for a test image is a linear combination of that image's 2048 input pixels, weighted by the coefficients learned for that output pixel during training (plus an intercept term).

If you look at the sklearn LinearRegression documentation, you'll see that the coefficients of a multi-target regression have shape (n_targets, n_features), which here is (2048 targets, 2048 features):

In [24]: ESTIMATORS['Linear regression'].coef_.shape
Out[24]: (2048, 2048)
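
To make the shapes concrete, here is a minimal sketch on small random data (the sizes 50, 8 and 6 are made-up stand-ins for 300, 2048 and 2048, not the face data). It checks that coef_ has one row per output and that predict is just the input multiplied by the transposed coefficient matrix, plus the intercept:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)

# Hypothetical small sizes standing in for 300 samples, 2048 inputs, 2048 outputs.
n_samples, n_features, n_targets = 50, 8, 6
X = rng.rand(n_samples, n_features)
Y = rng.rand(n_samples, n_targets)

model = LinearRegression().fit(X, Y)

coef_shape = model.coef_.shape            # one row of weights per output
intercept_shape = model.intercept_.shape  # one intercept per output

# Each predicted output is a weighted sum of the inputs plus its intercept.
X_new = rng.rand(3, n_features)
manual = X_new @ model.coef_.T + model.intercept_
matches = np.allclose(manual, model.predict(X_new))

print(coef_shape, intercept_shape, matches)
```

Row j of coef_ holds the weights used to predict output pixel j, so in the face example each lower-half pixel gets its own set of 2048 weights.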

Under the hood, it's calling scipy.linalg.lstsq, so it's important to note that there's no "information sharing" between the outputs: each output pixel gets its own row of coefficients and is a separate linear combination of all 2048 input pixels.
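
One way to see that the outputs are independent is to solve the least-squares problem once for all outputs together and once per output column, then compare. This is only a sketch: it uses numpy.linalg.lstsq as a stand-in for the scipy routine, made-up random data, and an explicit column of ones for the intercept, which is an assumption about how to mimic the fit rather than sklearn's actual code path:

```python
import numpy as np

rng = np.random.RandomState(0)

# Hypothetical small problem: 40 samples, 5 inputs, 3 outputs.
X = rng.rand(40, 5)
Y = rng.rand(40, 3)

# Append a column of ones so the last weight acts as the intercept.
A = np.hstack([X, np.ones((X.shape[0], 1))])

# Solve for all outputs at once: W_joint has shape (6, 3).
W_joint, *_ = np.linalg.lstsq(A, Y, rcond=None)

# Solve for each output column on its own and stack the solutions.
W_single = np.column_stack(
    [np.linalg.lstsq(A, Y[:, j], rcond=None)[0] for j in range(Y.shape[1])]
)

same = np.allclose(W_joint, W_single)
print(W_joint.shape, same)
```

If the two solutions agree, fitting the outputs jointly gives nothing beyond fitting 2048 separate single-output regressions on the same inputs.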

Upvotes: 1
