user1131662

Reputation: 337

Delta rule neural network teaching. Algorithm explanation necessary

I'm doing research on neural networks -- a project just for myself. Earlier I managed to understand the basics of the backpropagation training algorithm, though not the whole story, of course. But many resources refer to the delta rule, which is a bit special. I've gathered that weights are modified one by one here, but I still have a lot of questions. Could you explain how it works, in a more approachable way than the Wikipedia article does? Just the algorithm, with a clear explanation of the steps and how it works.


By the way, derivatives are used for training, and I can't understand why. And no, source code isn't necessary unless it helps.

Upvotes: 1

Views: 2666

Answers (2)

comingstorm

Reputation: 26107

The overall idea is to treat the neural net as a function of the weights w_ij, instead of the inputs: the goal is to minimize the error between the actual outputs and the target outputs in your training data. For each (input/output) training pair, the delta rule determines the direction you need to adjust w_ij to reduce the error for that training pair. By taking short steps for each training pair, you find a direction which is best for the entire training corpus.

Imagine you are in the middle of a huge, mountainous ski resort which is too complicated to understand all at once -- but if your job is to make it to the bottom, all you need to do is head downhill from where you're standing. This is called the gradient descent method: find the steepest way down the slope from where you are, and take a step in that direction. Enough steps will see you at the bottom; for a neural net, the "bottom" is a neural net that is a best fit for your training data.
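The gradient descent idea above can be sketched in a few lines of Python. This is a hypothetical one-dimensional illustration (the function and names are mine, not from any library): we minimize f(x) = (x - 3)^2 by repeatedly stepping in the downhill direction given by its derivative.

```python
# 1-D gradient descent sketch: minimize f(x) = (x - 3)^2.
# The derivative f'(x) = 2 * (x - 3) is the "slope" -- it tells us
# which way is downhill from where we currently stand.

def gradient_descent(start, learning_rate=0.1, steps=100):
    x = start
    for _ in range(steps):
        slope = 2 * (x - 3)          # derivative of (x - 3)^2 at x
        x -= learning_rate * slope   # take a short step downhill
    return x

print(gradient_descent(start=10.0))  # converges toward the minimum at x = 3
```

For a neural net the picture is the same, except x is replaced by the whole vector of weights and f by the training error.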

This is why you need the derivative: the derivative is the slope, and it turns out it's easy to compute -- that's your delta rule. Derivatives are used for teaching because that's how they got the rule in the first place.
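As a concrete sketch, here is the delta rule applied to a single linear neuron (all names are illustrative). For each training pair, each weight is nudged by learning_rate * (target - output) * input_i, which is exactly a gradient descent step on the squared error for that pair:

```python
# Minimal delta-rule sketch for one linear neuron (hypothetical example).
# Per training pair, the update is:
#     delta_w_i = learning_rate * (target - output) * input_i
# i.e., a short downhill step on the squared error (target - output)^2.

def train_neuron(samples, n_inputs, learning_rate=0.05, epochs=200):
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            output = bias + sum(w * x for w, x in zip(weights, inputs))
            error = target - output
            # Delta rule: adjust each weight in the direction that
            # reduces the error for this particular training pair.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

# Learn the (made-up) linear target y = 2*x1 - x2 from a few examples.
data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0),
        ([1.0, 1.0], 1.0), ([2.0, 1.0], 3.0)]
w, b = train_neuron(data, n_inputs=2)
```

For multi-layer nets with nonlinear activations, the derivative of the activation function enters the update too -- that generalization is backpropagation.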

For a step-by-step derivation of the delta rule, I'm afraid I can't improve on the wikipedia article you refer to.

Upvotes: 3

Alex Hoppus

Reputation: 3935

Maybe this resource will help you a lot (if you haven't already discovered it): http://www.ml-class.org. There you can find excellent short video lectures (15 minutes or less), some of them about the mathematical background and intuition behind the backpropagation algorithm. Hope it's useful.

Upvotes: 2
