Reputation: 405
I wanted to understand how regression works by implementing it in Octave. To check whether my function is correct, I compare its result with the built-in function polyfit. Here is my code:
p = [0, 1.5 * pi];

% function to approximate
function y = f(x)
  y = exp(-4 * sin(4 * x));
end

% least-squares polynomial fit via the normal equations
function c = regress1(x, y, degree)
  % one Vandermonde row [1, j, j^2, ..., j^degree] per sample point
  L = @(j)(arrayfun(@(k)(j^k), (0:degree)));
  x_data = [];
  for i = x
    x_data = [x_data; L(i)];
  end
  mx = x_data;
  my = y';
  % normal equations; fliplr puts the coefficients in polyval order
  c = fliplr((pinv(mx' * mx) * mx' * my)');
end
% number of known points
N = 50;
% polynomial degree
D = 20;
% known points
X = linspace(p(1), p(2), N);
Y = arrayfun(@f, X);
% my function
R = regress1(X, Y, D);
% reference function
C = polyfit(X, Y, D);
% test points
W = linspace(p(1), p(2), 1000);
V1 = arrayfun(@(x)(polyval(R,x)), W);
Z = arrayfun(@(x)(polyval(C,x)), W);
clf;
hold on;
fplot(@f, p);
plot(X, Y, '*');
plot(W, V1, 'r-');   % my fit in red
plot(W, Z, 'g-');    % polyfit in green
hold off;
The problem is that it works for small degrees (variable D in my program), but for bigger degrees it produces far too small coefficients. Plot of the original function and the two approximations (original in blue, mine in red, polyfit in green):
I suspect that something in this part, (pinv(mx' * mx) * mx' * my)', may be wrong. Also, if I change from pinv to inv I get a different result, but an even worse one.
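To illustrate the numerical trouble, the condition number of mx' * mx can be checked like this (a diagnostic sketch of my own, rebuilding the same matrix that regress1 constructs):

% Diagnostic sketch: the condition number of mx' * mx grows
% rapidly with the degree, so pinv/inv cannot recover accurate
% coefficients for large D.
degree = 20;
x = linspace(0, 1.5 * pi, 50);
L = @(j)(arrayfun(@(k)(j^k), (0:degree)));
mx = [];
for i = x
  mx = [mx; L(i)];
end
printf("cond(mx' * mx) = %g\n", cond(mx' * mx));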
I did some research and found some other solutions; even though those programs are similar to mine, I can't find the bug in my code. Could someone explain to me what is wrong, please?
Upvotes: 1
Views: 1867
Reputation: 393
Your matrix becomes ill-conditioned for large D (the degree of the approximating polynomial). You can try a different polynomial basis for the regression, e.g. Chebyshev polynomials instead of x, x^2, x^3, ..., x^D.
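For instance, here is a minimal sketch of the same least-squares fit in a Chebyshev basis (the names cheb_row and regress_cheb are mine, not from the original code; it maps x linearly onto [-1, 1], where the basis is well-behaved):

1;  % mark this file as a script, not a function file

% row [T_0(t), T_1(t), ..., T_degree(t)] via the Chebyshev recurrence
% T_{k+1}(t) = 2*t*T_k(t) - T_{k-1}(t)
function row = cheb_row(t, degree)
  row = zeros(1, degree + 1);
  row(1) = 1;
  if degree >= 1
    row(2) = t;
  end
  for k = 3:(degree + 1)
    row(k) = 2 * t * row(k - 1) - row(k - 2);
  end
end

function c = regress_cheb(x, y, degree)
  % map x linearly onto [-1, 1]
  a = min(x); b = max(x);
  t = (2 * x - a - b) / (b - a);
  mx = [];
  for i = t
    mx = [mx; cheb_row(i, degree)];
  end
  c = mx \ y';   % backslash is more stable than forming mx' * mx
end

% usage, reusing X, Y, D from the question:
% c = regress_cheb(X, Y, D);

Evaluating the fit then means summing c(k+1) * T_k(t) at each mapped point (another cheb_row call per point) instead of calling polyval.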
You can also obtain a slightly better result if you center and scale your data. Basically, if you are fitting a function to a set (x, y), you can redefine the problem using
xn = (x - mean(x)) / std(x)
and then fit on (xn, y).
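A minimal sketch, reusing X, Y, D and W from the question's script:

% Sketch: center and scale before fitting, then apply the same
% mapping to the evaluation points.
xn = (X - mean(X)) / std(X);
Cn = polyfit(xn, Y, D);
Zn = polyval(Cn, (W - mean(X)) / std(X));

% Octave's polyfit can also return the centering/scaling parameters
% itself (third output mu), which polyval then applies for you:
% [Cn, s, mu] = polyfit(X, Y, D);
% Zn = polyval(Cn, W, [], mu);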
Upvotes: 1