Reputation: 428
I can make a neural network; I just need clarification on bias implementation. Which way is better: implement the bias matrices B1, B2, .. Bn
for each layer in their own separate matrices, apart from the weight matrix, or include the biases in the weight matrix by appending a 1
to the previous layer's output (the input for this layer)? In images, I am asking whether this implementation:
Or this implementation:
Is the best. Thank you
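To make the two options concrete, here is a minimal NumPy sketch of what I mean (layer sizes and variable names are just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_out = 3, 2
x = rng.standard_normal(n_in)          # previous layer's output

# Option 1: separate weight matrix and bias vector
W = rng.standard_normal((n_out, n_in))
b = rng.standard_normal(n_out)
z1 = W @ x + b

# Option 2: fold the bias into the weight matrix as an extra column
# and append a 1 to the input so that column acts as the bias
W_aug = np.hstack([W, b[:, None]])     # shape (n_out, n_in + 1)
x_aug = np.append(x, 1.0)              # shape (n_in + 1,)
z2 = W_aug @ x_aug

print("equal:", np.allclose(z1, z2))
```

Both options produce the same pre-activation, so the question is only about which is better to implement.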
Upvotes: 11
Views: 4941
Reputation: 347
I think the best way is to have two separate matrices, one for the weights and one for the bias. Why?
There is no meaningful increase in computational load: computing W*x + b with a separate bias is mathematically and computationally equivalent to folding the bias into the weight matrix, and both should run equally fast on a GPU.
Greater modularity. Say you want to initialize the weights and the bias with different initializers (ones, zeros, Glorot, ...). With two separate matrices this is straightforward.
Easier to read and maintain.
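For example (a rough NumPy sketch; `glorot_uniform` here is a hand-rolled stand-in for the usual Glorot/Xavier initializer, and the layer sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)

def glorot_uniform(n_in, n_out):
    # Glorot/Xavier uniform: U(-limit, limit) with limit = sqrt(6 / (n_in + n_out))
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_out, n_in))

n_in, n_out = 784, 128

# With separate matrices, each parameter gets its own initializer:
W = glorot_uniform(n_in, n_out)   # Glorot for the weights
b = np.zeros(n_out)               # zeros for the bias

# With the bias folded into W, you would instead have to slice out
# the last column of the combined matrix to initialize it differently,
# which is easy to get wrong and harder to read.
```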
Upvotes: 1
Reputation: 2919
In my opinion, implementing the bias matrices separately for each layer is the way to go. This adds more parameters for your model to learn, but it gives the model more freedom to converge.
For more information read this.
Upvotes: 0
Reputation: 46353
include the biases in the weight matrix by adding a 1 to the previous layer output (input for this layer)
This seems to be what is implemented here: Machine Learning with Python: Training and Testing the Neural Network with MNIST data set, in the paragraph "Networks with multiple hidden layers".
I don't know whether it's the best way to do it, though. (Maybe unrelated, but worth noting: the example code there worked with sigmoid, yet failed when I replaced it with ReLU.)
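A sketch of that augmented-input style (my own minimal version, not the tutorial's code): a 1 is appended to each layer's input, so the last column of every weight matrix plays the role of the bias.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Layer sizes: 4 inputs -> 5 hidden -> 3 outputs.
# Each weight matrix gets one extra column to hold the biases.
sizes = [4, 5, 3]
weights = [rng.standard_normal((n_out, n_in + 1)) * 0.1
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(x, weights):
    a = x
    for W in weights:
        a = np.append(a, 1.0)   # the appended 1 multiplies the bias column
        a = sigmoid(W @ a)
    return a

out = forward(rng.standard_normal(4), weights)
print(out.shape)
```

The appending has to happen at every layer, not just at the network input, which is one reason this style is easy to get subtly wrong.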
Upvotes: 0