Kai

Reputation: 428

Proper way to implement biases in Neural Networks

I can build a neural network; I just need clarification on bias implementation. Which way is better: implement the bias matrices B1, B2, ..., Bn for each layer in their own matrices, separate from the weight matrices, or fold the biases into the weight matrix by appending a 1 to the previous layer's output (this layer's input)? In images, I am asking whether this implementation:

[Image: biases kept in separate matrices, added after the weight multiplication]

Or this implementation:

[Image: biases folded into the weight matrix, with a 1 appended to the layer input]

is best. Thank you.
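For concreteness, the two options can be sketched in NumPy (a minimal illustration; names and sizes are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)        # input vector (previous layer's output)
W = rng.normal(size=(2, 3))   # weights for a layer with 2 units
b = rng.normal(size=2)        # biases for that layer

# Option 1: keep the bias in its own vector/matrix
y1 = W @ x + b

# Option 2: fold the bias into the weight matrix as an extra
# column, and append a constant 1 to the input
W_aug = np.hstack([W, b[:, None]])   # shape (2, 4)
x_aug = np.append(x, 1.0)            # shape (4,)
y2 = W_aug @ x_aug

# Both produce the same pre-activation
assert np.allclose(y1, y2)
```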

Upvotes: 11

Views: 4941

Answers (3)

WristMan
WristMan

Reputation: 347

I think the best way is to have two separate matrices: one for the weights and one for the bias. Why?

  • There is no meaningful increase in computational load: on a GPU, W*x and W*x + b cost essentially the same, and mathematically the two formulations are equivalent.

  • Greater modularity. Say you want to initialize the weights and the bias with different initializers (ones, zeros, Glorot, ...). With two separate matrices this is straightforward.

  • It is easier to read and maintain.
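A sketch of that modularity point (the initializer choices here are just illustrative): with W and b kept apart, each gets its own initialization in one line.

```python
import numpy as np

def init_layer(n_in, n_out, rng):
    # Glorot-style uniform initialization for the weights...
    limit = np.sqrt(6.0 / (n_in + n_out))
    W = rng.uniform(-limit, limit, size=(n_out, n_in))
    # ...and plain zeros for the bias
    b = np.zeros(n_out)
    return W, b

rng = np.random.default_rng(0)
W, b = init_layer(3, 2, rng)
assert W.shape == (2, 3) and np.all(b == 0)
# With an augmented weight matrix, the bias column would have to be
# sliced out and re-initialized separately after the fact.
```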

Upvotes: 1

Suleka_28
Suleka_28

Reputation: 2919

In my opinion, implementing the bias matrices separately for each layer is the way to go. This adds more parameters that your model will have to learn, but it gives your model more freedom to converge.
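A small illustration of that extra freedom: without a bias term, a layer's pre-activation is pinned to zero whenever the input is zero, so the layer cannot represent a nonzero offset (toy NumPy sketch, values made up):

```python
import numpy as np

x = np.zeros(3)                   # all-zero input
W = np.array([[0.5, -1.0, 2.0]])  # any weights at all
b = np.array([0.7])

# No bias: the output is forced to 0 regardless of W
assert np.allclose(W @ x, 0.0)

# With a bias: the layer can shift its output freely
assert np.allclose(W @ x + b, 0.7)
```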

For more information, read this.

Upvotes: 0

Basj

Reputation: 46353

include the biases in the weight matrix by adding a 1 to the previous layer output (input for this layer)

This seems to be what is implemented here: Machine Learning with Python: Training and Testing the Neural Network with MNIST data set in the paragraph "Networks with multiple hidden layers".

I don't know whether it's the best way to do it, though. (Maybe unrelated, but still: the example code there worked with sigmoid, yet failed when I replaced it with ReLU.)
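The "append a 1" approach generalizes across multiple hidden layers by augmenting every layer's output before the next matrix multiply. A minimal sketch (not the tutorial's actual code; shapes are made up):

```python
import numpy as np

def forward(x, weights, act):
    # Each W in `weights` carries an extra column holding that layer's
    # biases, so a constant 1 is appended to the activations before
    # every multiply.
    a = x
    for W in weights:
        a = act(W @ np.append(a, 1.0))
    return a

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3 + 1)),   # layer 1: 3 inputs + bias column
           rng.normal(size=(2, 4 + 1))]   # layer 2: 4 inputs + bias column
out = forward(rng.normal(size=3), weights, sigmoid)
assert out.shape == (2,)
```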

Upvotes: 0
