Charles-Ugo Brouillard

Reputation: 221

XOR always converging toward 0.5 using backpropagation in a sigmoidal neural net (C++)

Thanks for taking the time to read this post.

I know this question has been asked a lot, and I have checked many posts about this issue; however, my quest for successful XOR learning using backpropagation remains unfinished.

As suggested in those posts, I tried tweaking the learning rate and the momentum, with and without biases, etc., but still no success.

The network consists of 2 input neurons, 2 hidden neurons, and 1 output neuron, all sigmoid. The output neuron always seems to converge around 0.5 for every input.
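For clarity, here is the truth table the network is trained on (just an illustrative array; my real data goes through my dataset class):

// XOR truth table: two inputs in {0,1}, one target in {0,1}
// (illustrative only -- not my actual dataset container)
const double xor_patterns[4][3] = {
    // in0   in1   target
    {  0.0,  0.0,  0.0 },
    {  0.0,  1.0,  1.0 },
    {  1.0,  0.0,  1.0 },
    {  1.0,  1.0,  0.0 }
};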

I would therefore greatly appreciate your expertise on the matter. I am using a self-made C++ library (so I can learn in depth how the basics work).

Here are the lines of interest from my code:

Getting the error gradient from the output neuron:

void ClOutputSigmoidNeuron::ComputeErrorGradient()
{
    double wanted_output = this->m_dataset->GetNextData();
    double delta = wanted_output - this->m_result_buffer;
    this->m_error_gradient = delta * this->SigmoidDerivative(this->m_result_buffer);
}
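As far as I understand, this is meant to implement the output delta of the delta rule:

delta_output = (target - output) * f'(result)

where f' is the sigmoid derivative. Whether f' should be applied to the raw weighted sum or to the already-activated output is something I come back to below, after the transfer functions.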

Getting the error gradient from a hidden neuron:

void ClSigmoidNeuron::ComputeErrorGradient()
{
    double tmpBuffer = 0.00;
    for(std::size_t i=0;i<this->m_output_connections.size();i++)
    {
        ClNeuron* target_neuron = (ClNeuron*)m_output_connections[i]->m_target_neuron;
        tmpBuffer += (target_neuron->m_error_gradient * this->m_output_connections[i]->m_weight);
    }

    //Get the sigmoid derivative
    this->m_error_gradient = tmpBuffer * this->SigmoidDerivative(this->m_result_buffer);
}
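Again, as I understand it, this is the standard backpropagated gradient for a hidden neuron: the downstream deltas are summed, weighted by the outgoing connection weights, then multiplied by the local derivative:

delta_hidden = f'(result) * sum over downstream neurons k of (delta_k * w_hidden->k)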

Weight update for a general neuron:

void ClNeuron::UpdateWeights()
{ 
    for(std::size_t i=0;i<this->m_input_connections.size();i++)
    {
        double momentum = this->m_input_connections[i]->m_weight_last_delta * this->m_input_connections[i]->m_momentum_value;
        double new_weight_delta = this->m_learning_rate * this->m_error_gradient * this->m_input_connections[i]->m_data + momentum ;
        this->m_input_connections[i]->m_weight += new_weight_delta;
        this->m_input_connections[i]->m_weight_last_delta = new_weight_delta;
        this->m_input_connections[i]->m_number_of_time_updated++;
    }
}
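The update is intended to be the usual gradient step with momentum:

delta_w(t) = learning_rate * error_gradient * input + momentum * delta_w(t-1)

with the new delta stored so it can serve as the momentum term on the next pass.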

Transfer functions:

double ClNeuron::Sigmoid(double p_value)
{
    return 1.00 / (1.00 + std::exp(p_value*-1.00));
}


double ClNeuron::SigmoidDerivative(double p_value)
{
    double sigmoid = this->Sigmoid(p_value);
    return sigmoid * (1.00 - sigmoid);
}
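One detail I am unsure about: SigmoidDerivative() computes Sigmoid(p_value) * (1 - Sigmoid(p_value)), which is only correct when p_value is the raw weighted sum (the pre-activation value). Since I pass m_result_buffer to it, and that buffer may already hold the activated output, the sigmoid could effectively be applied twice. If the buffer holds the output y = Sigmoid(net), the derivative should simply be y * (1 - y), along these lines (a hypothetical alternative; the method name is made up):

// Derivative of the sigmoid expressed in terms of the neuron's
// already-activated output y = Sigmoid(net), instead of the raw net input
double ClNeuron::SigmoidDerivativeFromOutput(double p_output)
{
    return p_output * (1.00 - p_output);
}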

The function used for training:

bool ClBackPropagationSupervisedTrainer::Train()
{
    for (std::size_t i = 0; i < this->m_dataset_size; i++)
    {
        this->m_network->Fire();

        if (!this->m_network->ComputeErrorGradients())
        {
            std::cout << "ClBackPropagationSupervisedTrainer:Train - Oops" << std::endl;
            return false;
        }

        this->m_network->UpdateWeights();
    }

    return true;
}
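For completeness, Train() makes a single pass over the dataset, so it is called repeatedly from outside; my driver loop looks roughly like this (the epoch count is arbitrary):

// Run many epochs -- XOR typically needs thousands of passes to converge.
// 'trainer' is the ClBackPropagationSupervisedTrainer shown above.
for (std::size_t epoch = 0; epoch < 10000; epoch++)
{
    if (!trainer->Train())
        break;
}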

Again, thanks for reading this. I know this question has been asked a lot! Pointing me in the right direction would be greatly appreciated.

Upvotes: 2

Views: 489

Answers (1)

Charles-Ugo Brouillard

Reputation: 221

Interestingly enough, in case it helps someone: changing from a Sigmoid() network to a TanH() network solved the issue.

In a way it does make sense, since tanh is zero-centered and has a steeper gradient around zero, so the error signal does not vanish as quickly. And yet a sigmoid transfer function seemed perfect for this kind of problem, since XOR is already normalized between 0 and 1...
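For anyone attempting the same fix, here are the tanh transfer function and its derivative, written in the same style as the Sigmoid() code above (a sketch; the method names are just mirrored from my sigmoid ones):

// Hyperbolic tangent activation, output in (-1, 1); needs <cmath>
double ClNeuron::TanH(double p_value)
{
    return std::tanh(p_value);
}

// Derivative of tanh: 1 - tanh(x)^2
double ClNeuron::TanHDerivative(double p_value)
{
    double tanh_value = std::tanh(p_value);
    return 1.00 - tanh_value * tanh_value;
}

Note that with tanh the targets may need to be remapped from {0, 1} to {-1, 1} (or the output neuron kept as a sigmoid), since tanh saturates at -1 and 1.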

Upvotes: 1
