user6350437

Perceptron and Multilayer perceptron with 1 hidden node for solving XOR

I am playing with Torch7 these days.

Today, I implemented a perceptron and a multilayer perceptron (MLP) to solve XOR.

As expected, the MLP solves XOR well and the perceptron does not.

But I was curious what happens when the number of hidden nodes is one.

I expected the MLP to behave the same as the perceptron, because it has only one hidden node.

But interestingly, the MLP did better than the perceptron.

In more detail: the perceptron gets 0.25 error (as expected), but the MLP with one hidden node gets approximately 0.16 error.

I thought one hidden node acts as one line in the problem space.

So, with only one hidden node, the MLP should be equivalent to the perceptron.

But this result told me I was wrong.

Now I want to know why the MLP with one hidden node is better than the perceptron.
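
To make my assumption concrete, this is the function each model computes (w1, w2, b, v, c below are placeholder values, not learned weights):

-- perceptron output: one affine function of the inputs, i.e. one line
-- f(x1, x2) = w1*x1 + w2*x2 + b
-- MLP with one tanh hidden node: the same affine function,
-- squashed by tanh and rescaled by a second affine layer
-- g(x1, x2) = v * tanh(w1*x1 + w2*x2 + b) + c
w1, w2, b = 1.0, 1.0, -0.5  -- placeholder hidden-layer weights
v, c = 1.0, 0.5             -- placeholder output-layer weights
f = function(x1, x2) return w1*x1 + w2*x2 + b end
g = function(x1, x2) return v * math.tanh(w1*x1 + w2*x2 + b) + c end
print(f(1, 0), g(1, 0))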

Could someone explain why this happens?

Thank you very much.

The following is the perceptron code:

-- perceptron

require 'nn'

-- data
data = torch.Tensor({ {0, 0}, {0, 1}, {1, 0}, {1, 1} })
-- target (shaped 4x1 to match the 4x1 model output)
target = torch.Tensor({ {0}, {1}, {1}, {0} })

-- model
perceptron = nn.Linear(2, 1)
-- loss function
criterion = nn.MSECriterion()

-- training
for i = 1, 10000 do
   -- set gradients to zero
   perceptron:zeroGradParameters()
   -- compute output
   output = perceptron:forward(data)
   -- compute loss
   loss = criterion:forward(output, target)
   -- compute gradients w.r.t. output
   dldo = criterion:backward(output, target)
   -- compute gradients w.r.t. parameters
   perceptron:backward(data, dldo)
   -- gradient descent with learningRate = 0.1
   perceptron:updateParameters(0.1)
   print(loss)
end
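
As a sanity check, the least-squares-optimal linear fit to XOR is the constant 0.5 for every input, which gives exactly the 0.25 error above (a minimal sketch):

-- the best any linear model can do on XOR under MSE:
-- predict 0.5 everywhere, giving an error of 0.25
preds = torch.Tensor(4, 1):fill(0.5)
t = torch.Tensor({ {0}, {1}, {1}, {0} })
print(torch.mean(torch.pow(preds - t, 2)))  -- prints 0.25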

And the following is the MLP code with one hidden node:

-- multilayer perceptron

require 'nn'

-- data
data = torch.Tensor({ {0, 0}, {0, 1}, {1, 0}, {1, 1} })
-- target (shaped 4x1 to match the 4x1 model output)
target = torch.Tensor({ {0}, {1}, {1}, {0} })

-- model
multilayer = nn.Sequential()
inputs = 2; outputs = 1; HUs = 1;
multilayer:add(nn.Linear(inputs, HUs))
multilayer:add(nn.Tanh())
multilayer:add(nn.Linear(HUs, outputs))
-- loss function
criterion = nn.MSECriterion()

-- training
for i = 1, 10000 do
   -- set gradients to zero
   multilayer:zeroGradParameters()
   -- compute output
   output = multilayer:forward(data)
   -- compute loss
   loss = criterion:forward(output, target)
   -- compute gradients w.r.t. output
   dldo = criterion:backward(output, target)
   -- compute gradients w.r.t. parameters
   multilayer:backward(data, dldo)
   -- gradient descent with learningRate = 0.1
   multilayer:updateParameters(0.1)
   print(loss)
end
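
To see how the single tanh unit trades error across the four points, I also print the per-input predictions after training (a small sketch appended to the script above):

out = multilayer:forward(data)
for i = 1, 4 do
   print(string.format("input (%g, %g) -> prediction %.3f (target %g)",
         data[i][1], data[i][2], out[i][1], target[i][1]))
end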

Upvotes: 0

Views: 281

Answers (1)

joaopauloc.souza

Reputation: 1

This difference in error is probably due to the learning rate. The number of epochs you use is high enough to converge. To fix this, keep lowering the learning rate in both cases, down to approximately 1e-4.
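
If you want to test this, here is a minimal sketch of such a sweep for the perceptron (substitute the MLP model in the same way):

require 'nn'

data = torch.Tensor({ {0, 0}, {0, 1}, {1, 0}, {1, 1} })
target = torch.Tensor({ {0}, {1}, {1}, {0} })

-- retrain from scratch at several learning rates and compare final losses
for _, lr in ipairs({ 1e-1, 1e-2, 1e-3, 1e-4 }) do
   local model = nn.Linear(2, 1)
   local criterion = nn.MSECriterion()
   local loss
   for i = 1, 10000 do
      model:zeroGradParameters()
      local output = model:forward(data)
      loss = criterion:forward(output, target)
      model:backward(data, criterion:backward(output, target))
      model:updateParameters(lr)
   end
   print(string.format("lr = %g  final loss = %.4f", lr, loss))
end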

Upvotes: 0
