shasan

Reputation: 188

Naïve Bayes Classifier -- is normalization necessary?

We recently studied the Naïve Bayesian Classifier in our Machine Learning class and now I'm trying to implement it on the Fisher Iris dataset as a self-exercise. The concept is easy and straightforward, with some trickiness involved for continuous attributes. Several literature resources I read recommend using a Gaussian approximation to compute the probability of continuous attribute values, so that's what I'm using in my code.

I'm initially running it with 50% training and 50% test samples, but something is off: the current code always predicts class 1 (I used integers to represent the classes) for every test sample, which is obviously wrong.

My guess was that the problem might be the missing normalization, though I'd expect normalization to only rescale the results proportionately, and indeed my attempts at normalizing so far have produced the same classifications.

Can someone suggest whether there is anything obvious missing here, or whether I'm not approaching this right? Since most of the code is 'mechanics', I have highlighted (****************) the two lines responsible for the calculations. Any help is appreciated, thanks!

nsamples=75;                                      % 50% samples
% acquire training set and test set
[trainingSample,idx] = datasample(data,nsamples,'Replace',false);
testData = data(setdiff(1:150,idx),:);

% define Gaussian function
%***********************************************************%
Phi=@(mu,sig2,x) (1/sqrt(2*pi*sig2))*exp(-((x-mu)^2)/2*sig2);
%***********************************************************%
   
for c=1:3                                         % for 3 classes in training set
    clear y x mu sig2;
    index=1;
    for i=1 : length(trainingSample)
        if trainingSample(i,5)==c
            y(index,:)=trainingSample(i,:);       % filter current class samples
            index=index+1;                        % for conditional probabilities
        end
    end
    
    for j=1:size(testData,1)                      % iterate over test samples
        clear pf p;
        for i=1:4                                 % iterate over columns
            x=testData(j,i);                      % representing attributes
            mu=mean(y(:,i));
            sig2=var(y(:,i));
            pf(i) = Phi(mu,sig2,x);               % calc conditional probability
        end
        
        % calc class likelihood; prior * posterior
        %*****************************************************%
        pc(j,c) = size(y,1)/nsamples * pf(1)*pf(2)*pf(3)*pf(4);
        %*****************************************************%
    end
end

% find the predicted class for each test sample
% by taking the max probability calculated    
for i=1:size(pc,1)
    [~,q]=max(pc(i,:));
    predicted(i)=q;
    actual(i)=testData(i,5);
end

Upvotes: 2

Views: 11157

Answers (1)

SlimJim

Reputation: 2272

Normalization shouldn't be necessary, since in the end the per-class scores for a given sample are only compared with each other.

p(class|thing) ∝ p(class) p(thing|class)
             = p(class) p(feature_1|class) p(feature_2|class) ... p(feature_N|class)

(Proportional, because the common denominator p(thing) is the same for every class and doesn't affect which class wins; the second step is the naive independence assumption.)

So when you fit the parameters of each feature_i|class distribution on rescaled data, the fitted parameters (here mu and sigma2) simply rescale along with the data; every class's score then picks up the same constant factor, and the predicted class stays the same.
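A quick way to convince yourself (a minimal sketch with made-up numbers; normpdf is from the Statistics Toolbox, which you already seem to have since you use datasample):

    % Made-up numbers for one feature and three classes.
    x   = 5.1;                 % raw feature value of a test sample
    mu  = [5.0 5.9 6.6];       % per-class means
    sig = [0.35 0.52 0.64];    % per-class standard deviations
    m = 5.8; s = 0.83;         % overall mean/std used for standardization

    rawLik = normpdf(x, mu, sig);                  % likelihoods on the raw scale
    stdLik = normpdf((x-m)/s, (mu-m)/s, sig/s);    % likelihoods after standardizing

    disp(rawLik ./ stdLik)     % every ratio is the same constant for all classes
    [~, c1] = max(rawLik);
    [~, c2] = max(stdLik);     % c1 == c2: the predicted class is unchanged

All the likelihoods change by the same constant factor, so max picks the same class either way.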

It's hard to read the MATLAB code due to all the indexing and manual splitting of training/test data etc., which is itself a possible source of bugs. You should try something with a lot less unnecessary stuff around it (I would recommend Python with scikit-learn, for example; it has a lot of helpers for splitting data and such: http://scikit-learn.org/).

It's really important that you separate the training and test data, train the model only on the training data, and evaluate the trained model only on the test data. (Is this done?)
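If you have the Statistics Toolbox (you already use datasample from it), cvpartition is one way to make that split explicit; just a sketch using the variable names from the question:

    cv             = cvpartition(size(data,1), 'HoldOut', 0.5);   % 50/50 hold-out split
    trainingSample = data(training(cv), :);                       % fit mu/sig2 from these rows only
    testData       = data(test(cv), :);                           % score these rows only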

The next step is to check the fitted parameters, which is most easily done by either printing them out (sanity check) or:

for each feature, rendering the fitted Gaussian bell next to a histogram of the data to see that they match (remember that each histogram bar must have height number_of_samples_within_bin/total_number_of_samples).
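In MATLAB that could look something like this (a rough sketch assuming y(:,i) holds feature i of the current class, as in the question's loop):

    xi   = y(:,i);                            % one feature, one class
    mu   = mean(xi);
    sig2 = var(xi);

    [counts, centers] = hist(xi, 10);         % 10 bins
    binWidth = centers(2) - centers(1);
    bar(centers, counts / numel(xi), 1);      % bar height = samples_in_bin / total_samples
    hold on;
    xs = linspace(min(xi), max(xi), 200);
    plot(xs, normpdf(xs, mu, sqrt(sig2)) * binWidth, 'r', 'LineWidth', 2);   % pdf scaled to bin heights
    hold off;

If the red curve sits nowhere near the bars, the fitted (mu, sigma2) for that feature/class pair is suspect.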

Visualising the data and the model is really important to know what is happening.

Upvotes: 3
