Reputation: 3616
I found this example of a CNN implemented in TensorFlow. In this example the bias is constant (starting at line 59):
58 W1 = tf.Variable(tf.truncated_normal([6, 6, 1, K], stddev=0.1))
59 B1 = tf.Variable(tf.constant(0.1, tf.float32, [K]))
60 W2 = tf.Variable(tf.truncated_normal([5, 5, K, L], stddev=0.1))
61 B2 = tf.Variable(tf.constant(0.1, tf.float32, [L]))
62 W3 = tf.Variable(tf.truncated_normal([4, 4, L, M], stddev=0.1))
63 B3 = tf.Variable(tf.constant(0.1, tf.float32, [M]))
Does that mean that the optimizer does not adjust the bias? If so, what is the reason for a constant bias? And why is the bias constant even in the fully connected part of the network?
Upvotes: 0
Views: 633
Reputation: 354
B1 = tf.Variable(tf.constant(0.1, tf.float32, [K]))
Here B1 (a variable) is created and initialized from a constant. The optimizer does change the variables B1, B2, and B3 during training; they were merely initialized with a constant value. Do you see the difference?
In C++ this would be similar to:
#include <iostream>

int main() {
    const float c = 0.1f;  // constant used only for initialization
    float B1 = c;          // bias, initialized from the constant
    float W1 = 0.2f;       // initialized weight
    float X = 10.0f;       // input

    float out = X * W1 + B1;
    std::cout << "output = " << out << ", B1 = " << B1 << "\n";

    // now update bias and weight, as an optimizer would
    B1 = B1 + B1 / 10;
    W1 = W1 + W1 / 10;
    out = X * W1 + B1;
    std::cout << "output = " << out << ", B1 = " << B1 << "\n";
}
This is exactly what is happening: initializing the variable B1 with a constant does not change the fact that B1 is still a variable. The initial value of 0.1 was simply the author's choice in the example you cited.
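You can check this directly in TensorFlow. Below is a minimal TF1-style sketch (the toy loss, the shape K=4, and the learning rate are made up for illustration): B1 is listed among the trainable variables, and its values move after a single optimizer step.

import tensorflow as tf

K = 4
B1 = tf.Variable(tf.constant(0.1, tf.float32, [K]))  # initialized from a constant
loss = tf.reduce_sum(tf.square(B1 - 1.0))            # toy loss that depends on B1
train_op = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

print(tf.trainable_variables())  # B1 is listed: it is trainable
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(B1))  # [0.1 0.1 0.1 0.1] -- the constant initial value
    sess.run(train_op)
    print(sess.run(B1))  # [1. 1. 1. 1.] -- one gradient step updated the bias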
Upvotes: 1
Reputation: 598
A bias in a neural network works exactly the same way it does in a linear equation:
y = mx + c
it shifts the output by a fixed amount. Since the example uses ReLU as the activation, any negative pre-activation is zeroed out, and no gradient flows back through that unit. Initializing the bias to a small positive value shifts the pre-activations upward, so more units start out active and more gradients can propagate through.
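A tiny sketch of that effect (the input values here are made up): without a bias, ReLU zeroes the negative pre-activations; a small positive bias pushes one of them back above zero, so that unit stays active.

import tensorflow as tf

x = tf.constant([[-0.05, 0.05, -0.3]])  # pre-activations, some negative
no_bias = tf.nn.relu(x)                 # negative entries are zeroed out
with_bias = tf.nn.relu(x + 0.1)         # a +0.1 bias revives the first unit

with tf.Session() as sess:
    print(sess.run(no_bias))    # [[0.   0.05 0.  ]]
    print(sess.run(with_bias))  # [[0.05 0.15 0.  ]]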
Usually the bias is initialized to zeros and can be set to be trainable or not. In this example it is initialized to the constant 0.1, but because it is wrapped in tf.Variable it is still trainable.
Bias initialization is not nearly as important to training as weight initialization.
Upvotes: 0
Reputation: 2206
The bias will change as the network is trained; it is just initialized at that value. The bias is very important in fully connected networks: it is a value that is always present regardless of the input, and networks perform better with it.
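As a small illustration (shapes and values are arbitrary, TF1-style), in a fully connected layer the bias is added to the output for every input, independent of what that input is:

import tensorflow as tf

x = tf.constant([[1.0, 2.0]])                             # one input, 2 features
W = tf.Variable(tf.truncated_normal([2, 3], stddev=0.1))  # trainable weights
b = tf.Variable(tf.constant(0.1, tf.float32, [3]))        # trainable bias
y = tf.matmul(x, W) + b                                   # bias added regardless of x

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y))  # every output carries the +0.1 offset from b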
Upvotes: 0