linear regression with multiple variables in matlab, formula and code do not match

Question

I have the following datasets:

X

X =

1.0000    0.1300   -0.2237
1.0000   -0.5042   -0.2237
1.0000    0.5025   -0.2237
1.0000   -0.7357   -1.5378
1.0000    1.2575    1.0904
1.0000   -0.0197    1.0904
1.0000   -0.5872   -0.2237
1.0000   -0.7219   -0.2237
1.0000   -0.7810   -0.2237
1.0000   -0.6376   -0.2237
1.0000   -0.0764    1.0904
1.0000   -0.0009   -0.2237
1.0000   -0.1393   -0.2237
1.0000    3.1173    2.4045
1.0000   -0.9220   -0.2237
1.0000    0.3766    1.0904
1.0000   -0.8565   -1.5378
1.0000   -0.9622   -0.2237
1.0000    0.7655    1.0904
1.0000    1.2965    1.0904
1.0000   -0.2940   -0.2237
1.0000   -0.1418   -1.5378
1.0000   -0.4992   -0.2237
1.0000   -0.0487    1.0904
1.0000    2.3774   -0.2237
1.0000   -1.1334   -0.2237
1.0000   -0.6829   -0.2237
1.0000    0.6610   -0.2237
1.0000    0.2508   -0.2237
1.0000    0.8007   -0.2237
1.0000   -0.2034   -1.5378
1.0000   -1.2592   -2.8519
1.0000    0.0495    1.0904
1.0000    1.4299   -0.2237
1.0000   -0.2387    1.0904
1.0000   -0.7093   -0.2237
1.0000   -0.9584   -0.2237
1.0000    0.1652    1.0904
1.0000    2.7864    1.0904
1.0000    0.2030    1.0904
1.0000   -0.4237   -1.5378
1.0000    0.2986   -0.2237
1.0000    0.7126    1.0904
1.0000   -1.0075   -0.2237
1.0000   -1.4454   -1.5378
1.0000   -0.1871    1.0904
1.0000   -1.0037   -0.2237

theta

0
0
0

y

The X set represents values for multiple variable regression, the first colum stands for X0, second X1; and so on.

The implementation formula is something like:

enter image description here

I have implemented a matlab code which is:

for i=1:size(theta,1)
    h=X*theta;
    sumE=sum((h-y).*X(:,i));
    theta(i)=theta(i)-alpha*(1/m)*sumE;
end

which is inside a for loop going from 1 to a n number of iterations (the value of m is not relevant, it can be set up to 40 for example). The problem is that even though the code works and the result is the one expected, when I submit it to a online checking program it appears that my results are wrong. The reason is that I should update theta simultaneously.

I have gotten the following Matlab code from Internet:

h = X*theta;
theta = theta - alpha / m * (X'*(h - y));

when I run the Internet solution it gives me almost the same answer as mine, with only a subtle difference in the 6th decimal position. When I submit that answer to the online program it is fully accepted, but I was wondering where the summation goes? in the formula explicitly indicates a summation which is no longer in the Internet solution. Maybe both codes are fine, but I do not know if the Internet author has made some linear algebra trick. Any help?

Thanks

DanielTheRocketMan · Accepted Answer

I am not sure if I understood your question, but the formula you copied from internet is X'(h-y). Note that there is a tranposition signal after X! So, this is a matrices product. Your sum (your loop) is replaced by this matrices product.

linear regression with multiple variables in matlab, formula and code do not match

Answers (2)

something you should try

Related Questions