Reputation: 2543
So I am trying to write a simple softmax classifier in TensorFlow.
Here is the code:
# Neural network parameters
n_hidden_units = 500
n_classes = 10
# training set placeholders
input_X = tf.placeholder(dtype='float32',shape=(None,X_train.shape[1], X_train.shape[2]),name="input_X")
input_y = tf.placeholder(dtype='int32', shape=(None,), name="input_y")
# hidden layer
dim = X_train.shape[1]*X_train.shape[2] # dimension of each traning data point
flatten_X = tf.reshape(input_X, shape=(-1, dim))
weights_hidden_layer = tf.Variable(initial_value=np.zeros((dim,n_hidden_units)), dtype ='float32')
bias_hidden_layer = tf.Variable(initial_value=np.zeros((1,n_hidden_units)), dtype ='float32')
hidden_layer_output = tf.nn.relu(tf.matmul(flatten_X, weights_hidden_layer) + bias_hidden_layer)
# output layer
weights_output_layer = tf.Variable(initial_value=np.zeros((n_hidden_units,n_classes)), dtype ='float32')
bias_output_layer = tf.Variable(initial_value=np.zeros((1,n_classes)), dtype ='float32')
output_logits = tf.matmul(hidden_layer_output, weights_output_layer) + bias_output_layer
predicted_y = tf.nn.softmax(output_logits)
# loss
one_hot_labels = tf.one_hot(input_y, depth=n_classes, axis = -1)
loss = tf.losses.softmax_cross_entropy(one_hot_labels, output_logits)
# optimizer
optimizer = tf.train.MomentumOptimizer(0.01, 0.5).minimize(
loss, var_list=[weights_hidden_layer, bias_hidden_layer, weights_output_layer, bias_output_layer])
This compiles, and I have checked the shape of all the tensor and it coincides with what I expect.
However, I tried to run the optimizer using the following code:
# running the optimizer
s = tf.InteractiveSession()
s.run(tf.global_variables_initializer())
for i in range(5):
s.run(optimizer, {input_X: X_train, input_y: y_train})
loss_i = s.run(loss, {input_X: X_train, input_y: y_train})
print("loss at iter %i:%.4f" % (i, loss_i))
And the loss kept being the same in all iterations!
I must have messed up something, but I fail to see what.
Any ideas? I also appreciate if somebody leaves comments regarding code style and/or tensorflow tips.
Upvotes: 0
Views: 611
Reputation: 3197
One could visualize the weight histogram using TensorBoard to make it easier. I executed your code for this. A few more lines are needed to set up Tensorboard logging but the histogram summary of weights can be easily added.
Initialized to zeros
weights_hidden_layer = tf.Variable(initial_value=np.zeros((784,n_hidden_units)), dtype ='float32')
tf.summary.histogram("weights_hidden_layer",weights_hidden_layer)
Xavier initialization
initializer = tf.contrib.layers.xavier_initializer()
weights_hidden_layer = tf.Variable(initializer(shape=(784,n_hidden_units)), dtype ='float32')
tf.summary.histogram("weights_hidden_layer",weights_hidden_layer)
Upvotes: 1
Reputation: 1994
You have made a mistake. You are initializing your weights using np.zeros
. Use np.random.normal
. You can choose mean
for this Gaussian Distribution by using number of inputs going to a particular neuron. You can read more about it here.
The reason that you want to initialize with Gaussian Distribution is because you want to break symmetry. If all the weights are initialized by zero
, then you can use backpropogation to see that all the weights will evolved same.
Upvotes: 1