Reputation: 1815
The problem is very simple: I am fitting a quartic curve to some noisy data. mldivide gives the wrong set of values for the coefficients, while direct computation of the pseudoinverse works.
Quartic equation: c0 + c1*t + c2*t^2 + c3*t^3 + c4*t^4
The input file contains t and the actual value for each sample.
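fid = fopen('Data_Corr.txt');
A = zeros(4001,5);
b = zeros(4001,1); % preallocate as a column so that A'*b is well defined
for i = 1:4001
    dataPt = fscanf(fid,'%f',2); % read one (t, value) pair
    A(i,:) = [1 dataPt(1) dataPt(1)^2 dataPt(1)^3 dataPt(1)^4];
    b(i) = dataPt(2);
end
fclose(fid);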
%c = b\A; %using matlab mldivide
c = inv(A'*A)*A'*b; %computing pseudo inverse directly
for i = 1:4001
d(i) = A(i,1)*c(1) + A(i,2)*c(2) + A(i,3)*c(3) + A(i,4)*c(4) + A(i,5)*c(5);
end
figure; hold on; grid on;
plot(b,'-b');
plot(d,'r-');
Upvotes: 1
Views: 2015
Reputation: 8477
If I interpret your formula and your code correctly, A holds the powers of t and b contains the data you are trying to model. Using c for the coefficient vector, the equation you want to solve is therefore
b = A * c
Solving this using mldivide leads to
c = A \ b
while solving it using pinv results in
c = pinv(A) * b
Your code contains a line corresponding to the second of these two equations, in the form c = inv(A'*A)*A'*b. Your use of the backslash operator, however, is the wrong way around: you wrote b\A where it should be A\b.
I strongly suspect that your mismatch comes from this error.
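The effect of the operand order can be checked on synthetic data. The following is only a minimal sketch: the time axis t, the coefficients c_true and the noise level are made up for illustration and do not come from your file.

% Synthetic quartic data; t, c_true and the noise level are assumptions.
t = linspace(-1, 1, 4001)';              % column vector of sample times
A = [ones(size(t)) t t.^2 t.^3 t.^4];    % 4001-by-5 matrix of powers of t
c_true = [1; -2; 0.5; 3; -1];            % hypothetical coefficients
b = A * c_true + 0.01 * randn(size(t));  % noisy column of observations

c_right = A \ b;          % least-squares solution of A*c = b, 5-by-1
c_pinv  = pinv(A) * b;    % the same solution via the pseudoinverse
c_wrong = b \ A;          % solves b*x = A instead, giving a 1-by-5 row

disp(size(c_right));            % column of 5 coefficients
disp(size(c_wrong));            % row of 5 unrelated numbers
disp(norm(c_right - c_pinv));   % close to zero: the two methods agree

If this sketch reflects your setup, c_right and c_pinv should both land close to c_true, while c_wrong answers a different question altogether.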
If the mismatch still exists after correcting it, the second possibility is that your linear system is not sufficiently determined by the data. In that case, pinv gives you the solution with the minimum sum of squares of the coefficients, sum(c .^ 2), while mldivide produces a solution in which some coefficients are exactly zero, a "sparse" solution. How exactly this solution is arrived at depends, as @Dmitry pointed out, on the precise properties of the matrices involved. An example:
A = [1 2 0; 0 4 3];
b = [8; 18];
c_mldivide = A \ b
c_pinv = pinv(A) * b
gives the output
c_mldivide =
0
4
0.66666666666667
c_pinv =
0.918032786885245
3.54098360655738
1.27868852459016
Because the 3 coefficients of the system are constrained by only 2 data points, one of them can be effectively freely chosen.
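To see that both answers are valid solutions of the same system, one can compare their residuals and norms. This is a small sketch built on the example above; the variable names res_*, norm_* and null_part are introduced here only for illustration.

A = [1 2 0; 0 4 3];
b = [8; 18];
c_mldivide = A \ b;
c_pinv = pinv(A) * b;

% Both vectors reproduce b up to rounding error.
res_mldivide = norm(A * c_mldivide - b);
res_pinv     = norm(A * c_pinv - b);

% pinv returns the solution with the smallest Euclidean norm.
norm_mldivide = norm(c_mldivide);
norm_pinv     = norm(c_pinv);

% The difference between the two solutions is mapped to zero by A,
% i.e. it lies in the null space of A.
null_part = norm(A * (c_mldivide - c_pinv));

fprintf('residuals: %g %g\n', res_mldivide, res_pinv);
fprintf('norms:     %g %g\n', norm_mldivide, norm_pinv);
fprintf('A*(difference): %g\n', null_part);

Both residuals should be at rounding level, and norm_pinv should not exceed norm_mldivide, which is exactly the minimum-norm property described above.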
I doubt that this applies to your problem, because a linear system with 5 coefficients should be well determined by 4001 data points, unless the data are highly redundant.
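Whether the data are redundant can be checked directly from the rank and condition number of A. The sketch below assumes a hypothetical time axis t, since I do not know the values in your file; substitute the A you build from Data_Corr.txt.

% Hypothetical sample times; replace A with the matrix built from your file.
t = linspace(0, 10, 4001)';
A = [ones(size(t)) t t.^2 t.^3 t.^4];

r = rank(A);    % number of linearly independent columns
k = cond(A);    % large values indicate an ill-conditioned fit

if r < size(A, 2)
    disp('A is rank deficient: A\b and pinv(A)*b may differ.');
else
    fprintf('A has full column rank %d, cond(A) = %g\n', r, k);
end

Note that forming inv(A'*A) squares the condition number of the problem, so even with full rank A \ b is usually the numerically safer choice.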
Upvotes: 4