Reputation: 11
I want to do LDA classification on my data. My data has 6 features, and I want to find out which one gives the best classification performance. My idea is to evaluate the features separately, fitting only one feature at a time to the LDA classifier with the MATLAB function fitcdiscr.
My question is: how can I visualize the output of the classification, as in the figure shown below?
After calling fitcdiscr I have a model. How can I draw the line in the figure that separates the two classes? Should I pass the feature together with the recording number to fitcdiscr?
Thank you very much!
Upvotes: 1
Views: 5040
Reputation: 14316
Here is some sample data:
x = [1, 1.5] .* randn(100, 2);
x(51:end, :) = [1, 2] .* x(51:end, :) + [2, 4];
y = [ones(50, 1); 2*ones(50, 1)];
If you fit an LDA model with
mdl = fitcdiscr(x, y);
this returns a ClassificationDiscriminant object, whose Coeffs property stores all LDA coefficients. Coeffs is a k-by-k struct array, where k is the number of classes, so here it is 2-by-2. Coeffs(i, j) contains the linear boundary between classes i and j. We are therefore only interested in Coeffs(1, 2), the boundary between classes 1 and 2.
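To make the layout of Coeffs concrete, here is a minimal sketch that fits the model on the sample data and pulls out the two values that define the boundary. The fixed random seed and the variable names c, K and L are my own additions for illustration, and the Statistics and Machine Learning Toolbox is assumed to be installed.

```matlab
% Sketch: inspect the boundary coefficients between classes 1 and 2.
rng(0);                                   % fixed seed (assumption, for reproducibility only)
x = [1, 1.5] .* randn(100, 2);            % sample data as defined earlier
x(51:end, :) = [1, 2] .* x(51:end, :) + [2, 4];
y = [ones(50, 1); 2*ones(50, 1)];
mdl = fitcdiscr(x, y);

c = mdl.Coeffs(1, 2);                     % struct for the boundary between class 1 and class 2
K = c.Const;                              % scalar offset of the boundary
L = c.Linear;                             % one coefficient per feature (two here)

disp(size(mdl.Coeffs))                    % k-by-k, with k = 2 classes
disp(K)
disp(L)
```

With more than two classes, the same indexing applies to each pair of classes, so each pairwise boundary would be read from its own Coeffs(i, j) entry.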
As described in the docs, the boundary between two classes satisfies the following equation, where x is a single observation, i.e. a vector of feature values (the quadratic term is dropped, since we are dealing with LDA and not QDA):
Const + Linear * x = 0
With two features, solving this for the second coordinate gives the equation of the line:
x(2) = -(Const + Linear(1) * x(1)) / Linear(2)
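As a quick sanity check of this rearrangement, the sketch below computes a few points from the line equation and substitutes them back into the boundary equation. The seed, the chosen x1 values and the names K, L, x1, x2 and residual are assumptions made for illustration. Note that the rearrangement divides by Linear(2), so it only applies when that coefficient is nonzero (otherwise the boundary is a vertical line).

```matlab
% Sketch: check that points from the rearranged equation lie on the boundary.
rng(0);                                   % fixed seed (assumption, for reproducibility only)
x = [1, 1.5] .* randn(100, 2);
x(51:end, :) = [1, 2] .* x(51:end, :) + [2, 4];
y = [ones(50, 1); 2*ones(50, 1)];
mdl = fitcdiscr(x, y);

K = mdl.Coeffs(1, 2).Const;
L = mdl.Coeffs(1, 2).Linear;

x1 = linspace(-3, 5, 5);                  % arbitrary values of the first feature
x2 = -(K + L(1) .* x1) ./ L(2);           % matching values of the second feature

% Substitute back into Const + Linear(1)*x1 + Linear(2)*x2;
% the result should be zero up to floating-point round-off.
residual = K + L(1) .* x1 + L(2) .* x2;
disp(max(abs(residual)))
```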
We can create a scatter plot with gscatter and add the line by taking the minimum and maximum x-values of the current axes (gca) and computing the corresponding y-values with the equation above.
figure(1)
gscatter(x(:, 1), x(:, 2), y);
hold on
lx = get(gca, 'Xlim');
ly = -(mdl.Coeffs(1, 2).Const + mdl.Coeffs(1, 2).Linear(1) .* lx) / mdl.Coeffs(1, 2).Linear(2);
plot(lx, ly, '-b', 'DisplayName', 'LDA')
hold off
which results in a scatter plot of the two classes with the LDA boundary drawn as a solid blue line.
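For the single-feature case asked about in the question, the same boundary equation reduces to a scalar threshold, Const + Linear * f = 0. The sketch below is one possible way to visualize that, not the only one: it fits the model on one feature only and uses the recording index purely as the horizontal plotting axis, so the index is not passed to fitcdiscr. The names f, mdl1, thr and idx, the choice of the first feature and the fixed seed are assumptions for illustration.

```matlab
% Sketch: LDA on a single feature, shown as a threshold over the recording index.
rng(0);                                   % fixed seed (assumption, for reproducibility only)
x = [1, 1.5] .* randn(100, 2);
x(51:end, :) = [1, 2] .* x(51:end, :) + [2, 4];
y = [ones(50, 1); 2*ones(50, 1)];

f = x(:, 1);                              % one feature, chosen arbitrarily here
mdl1 = fitcdiscr(f, y);                   % the model sees the feature only
c = mdl1.Coeffs(1, 2);
thr = -c.Const / c.Linear;                % Const + Linear * f = 0 solved for f

idx = (1:numel(f))';                      % recording index, used only for plotting
figure
gscatter(idx, f, y);
hold on
plot([idx(1), idx(end)], [thr, thr], '-b', 'DisplayName', 'LDA threshold')
hold off
xlabel('Recording index')
ylabel('Feature value')
```

To compare the six features, the same fit could be repeated per column and an error estimate such as resubLoss(mdl1) or kfoldLoss(crossval(mdl1)) collected for each one.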
Upvotes: 2