Gilfoyle

Reputation: 3616

Constant Bias in Convolutional Neural Network

I found this example of a CNN implemented in TensorFlow.

In this example of a CNN the bias is constant (starting at line 59):

58 W1 = tf.Variable(tf.truncated_normal([6, 6, 1, K], stddev=0.1))
59 B1 = tf.Variable(tf.constant(0.1, tf.float32, [K]))
60 W2 = tf.Variable(tf.truncated_normal([5, 5, K, L], stddev=0.1))
61 B2 = tf.Variable(tf.constant(0.1, tf.float32, [L]))
62 W3 = tf.Variable(tf.truncated_normal([4, 4, L, M], stddev=0.1))
63 B3 = tf.Variable(tf.constant(0.1, tf.float32, [M]))

Does that mean the optimizer does not adjust the bias? If so, what is the reason for a constant bias? And why is the bias constant even in the fully connected part of the network?

Upvotes: 0

Views: 633

Answers (3)

Aparajuli

Reputation: 354

B1 = tf.Variable(tf.constant(0.1, tf.float32, [K]))

Here B1 (a variable) is created and initialized using a constant. The optimizer does change the variables B1, B2, and B3 during training; they were merely initialized with a constant value. Do you see the difference?

In C++ this would be similar to:

#include <iostream>

int main() {
    const float c = 0.1f;  // constant used only for initialization
    float B1 = c;          // bias, initialized from the constant
    float W1 = 0.2f;       // initialized weight
    float X = 10.0f;       // input

    float out = X * W1 + B1;
    std::cout << "output = " << out << ", B1 = " << B1 << "\n";

    // now the optimizer updates the bias and the weight
    B1 = B1 + B1 / 10;
    W1 = W1 + W1 / 10;
    out = X * W1 + B1;
    std::cout << "output = " << out << ", B1 = " << B1 << "\n";
}

This is exactly what is happening: initializing the variable B1 with the constant c does not change the fact that B1 is still a variable. Initializing it with a constant was simply the author's decision in the example you cited.
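
You can check this in TensorFlow itself. Here is a minimal sketch, assuming TensorFlow 2.x with eager execution (the cited example uses the 1.x API, but the variable semantics are the same; the value of K is made up for illustration):

import tensorflow as tf

K = 4  # hypothetical channel count, not from the original example
B1 = tf.Variable(tf.constant(0.1, tf.float32, [K]))
print(B1.trainable)  # True: a constant initializer does not freeze the variable

# one gradient step on a toy loss moves B1 away from its initial 0.1
opt = tf.keras.optimizers.SGD(learning_rate=0.1)
with tf.GradientTape() as tape:
    loss = tf.reduce_sum(tf.square(B1))
grads = tape.gradient(loss, [B1])
opt.apply_gradients(zip(grads, [B1]))
print(B1.numpy())  # [0.08 0.08 0.08 0.08] -- updated, no longer 0.1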

Upvotes: 1

Abhishek Sehgal

Reputation: 598

A bias in a neural network works exactly the same way it does in a linear equation:

y = mx + c

it shifts the output by a fixed value. Since the example uses ReLU as the activation, any unit with a negative pre-activation outputs zero and passes no gradient through. Initializing the bias to a small positive value makes it more likely that units start in the active region, so more gradients can propagate.
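
A quick illustrative sketch of that effect (TensorFlow 2.x assumed; the weight and input values are made up for illustration):

import tensorflow as tf

w = tf.constant([1.0])
x = tf.constant([-0.05])  # drives the pre-activation slightly negative

for b0 in (0.0, 0.1):
    b = tf.Variable([b0])
    with tf.GradientTape() as tape:
        y = tf.nn.relu(w * x + b)
    print(b0, y.numpy(), tape.gradient(y, b).numpy())
# b = 0.0 -> output 0.0,  gradient 0.0 (unit is dead, no learning signal)
# b = 0.1 -> output 0.05, gradient 1.0 (gradient flows)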

Usually the bias is initialized to zeros, and a variable can be set to be trainable or not. In this example it is initialized to the constant 0.1, but since it is wrapped in a tf.Variable it is still trainable.

Bias initialization matters far less for training than weight initialization.

Upvotes: 0

Ignacio Peletier

Reputation: 2206

The bias will change as the network is trained; it is merely initialized at that value. The bias is very important in fully connected networks: it is a value that is always there regardless of the input, and networks perform better with it.
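
A minimal sketch of that last point (TensorFlow 2.x and tf.keras assumed; the layer sizes are made up for illustration): with a zero input, a dense layer's output is exactly its bias, showing the bias is present regardless of the input.

import tensorflow as tf

layer = tf.keras.layers.Dense(3, bias_initializer=tf.constant_initializer(0.1))
x = tf.zeros([1, 5])  # zero input: the weight term W @ x contributes nothing
print(layer(x).numpy())  # [[0.1 0.1 0.1]] -- the output is the bias alone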

Upvotes: 0
