user2675516

Update rule of Perceptron Learning Algorithm

While reading about the perceptron update rule, I came across two different formulas.

1. $w(t+1) = w(t) + y(t)x(t)$ (Yaser Abu-Mostafa's *Learning from Data*)

2. $w(t+1) = w(t) + \alpha(d - y(t))x(t)$

Why are there two different forms?

I also don't quite understand why the rule works. How can I prove that it does?

Upvotes: 1

Views: 1402

Answers (1)

runDOSrun

Reputation: 10985

Equation 1 is a mathematical formulation of Hebb's rule (usually you also factor in a learning rate, as in your second equation). It can be interpreted as "if two neurons fire at the same time, increase the weight between them". It's the earliest and simplest learning rule for neural networks.
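As a minimal sketch (not from your book), equation 1 is just a vector addition; the toy numbers below are made up purely for illustration:

```python
import numpy as np

# Update from equation 1: w(t+1) = w(t) + y(t) * x(t)
# (no learning rate; values chosen only for illustration)
w = np.array([0.0, 0.0, 0.0])    # current weights w(t)
x = np.array([1.0, 0.5, -1.0])   # input vector x(t)
y = 1.0                          # label/output y(t)

w = w + y * x   # weights move toward x when y = +1, away from x when y = -1
print(w)        # [ 1.   0.5 -1. ]
```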

This rule is not ideal for training on its own: for example, with binary (0/1) inputs or targets, any place where x or y is 0 produces a zero update, so those weights never change and no learning takes place.
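You can see that limitation in a two-line toy calculation (again, the numbers are only illustrative):

```python
import numpy as np

w = np.zeros(3)
x = np.array([1.0, 0.0, 1.0])  # binary input; middle component is 0
y = 0.0                        # binary target coded as 0

print(y * x)      # [0. 0. 0.] -- the whole update vanishes
print(w + y * x)  # weights unchanged, so nothing is learned
```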

To also handle neurons that are connected but do not fire together, the rule was improved into equation 2, the delta rule. This rule is in fact a special case of the more general backpropagation algorithm, which is used for networks with multiple layers.
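Here is a minimal sketch of the delta rule from your equation 2 on a toy problem; the AND dataset, the learning rate, the step activation, and the epoch count are my own assumptions for illustration, not something from your sources:

```python
import numpy as np

def step(z):
    """Threshold activation: 1 if z >= 0 else 0."""
    return 1.0 if z >= 0 else 0.0

# Toy AND dataset (linearly separable); the bias is folded in as a constant 1 input.
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
d = np.array([0, 0, 0, 1], dtype=float)

w = np.zeros(3)
alpha = 0.1

for epoch in range(20):
    for x, target in zip(X, d):
        y = step(w @ x)                 # current prediction y(t)
        w += alpha * (target - y) * x   # delta rule: only wrong predictions change w

print(w)  # a separating weight vector for AND (exact values depend on alpha)
```

Note that the update is zero whenever the prediction is already correct, which is exactly why the rule settles down once the data are separated.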

You can read up on the proofs on the linked pages (it wouldn't make sense to copy/paste them here). Some things like the Hebb rule just require a moment of thought rather than a formal proof (try calculating it with some training data on a piece of paper and you'll see what it does and what it does not do).

I'd actually recommend reading about the more complicated model (the multi-layer perceptron with backpropagation) first, since it's much more relevant (single-layer perceptrons are limited to linearly separable data, so they can't learn e.g. XOR), and once you understand it, you get the single-layer perceptron "for free".
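To see that limitation concretely, here is the same delta-rule loop run on XOR (a sketch with illustrative choices of learning rate and epoch count); the error count never reaches zero because no single line separates the two classes:

```python
import numpy as np

def step(z):
    return 1.0 if z >= 0 else 0.0

# XOR: not linearly separable, so a single-layer perceptron cannot learn it.
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
d = np.array([0, 1, 1, 0], dtype=float)

w = np.zeros(3)
alpha = 0.1

for epoch in range(100):
    errors = 0
    for x, target in zip(X, d):
        y = step(w @ x)
        w += alpha * (target - y) * x
        errors += int(y != target)

print(errors)  # stays above 0 no matter how many epochs you run
```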

Upvotes: 1
