In what circumstances might using biases in a neural network not be beneficial?

Question

I am currently looking through Michael Nielsen's ebook Neural Networks and Deep Learning and have run the code found at the end of chapter 1 which trains a neural network to recognize hand-written digits (with a slight modification to make the backpropagation algorithm over a mini-batch matrix-based).

However, having run this code and achieving a classification accuracy of just under 94%, I decided to remove the use of biases from the network. After re-training the modified network, I found no difference in classification accuracy!

NB: The output layer of this network contains ten neurons; if the ith of these neurons has the highest activation then the input is classified as being the digit i.

This got me wondering why it is necessary to use biases in a neural network, rather than just weights, and what differentiates between a task where biases will improve the performance of a network and a task where they will not?

My code can be found here: https://github.com/pipthagoras/neural-network-1

Juan Carlos Ramirez · Accepted Answer

Biases are used to account for the fact that your underlying data might not be centered. It is clearer to see in the case of a linear regression.

If you do a regression without an intercept (or bias), you are forcing the underlying model to pass through the origin, which will result in a poor model if the underlying data is not centered (for example if the true generating process is Y=3000). If, on the other hand, your data is centered or close to centered, then eliminating bias is good, since you won't introduce a term that is, in fact, independent to your predictive variable (it's like selecting a simpler model, which will tend to generalize better PROVIDED that it actually reflects the underlying data).

In what circumstances might using biases in a neural network not be beneficial?

Answers (1)

Related Questions