Jsevillamol

Reputation: 2543

Simple softmax classifier in tensorflow

So I am trying to write a simple softmax classifier in TensorFlow.

Here is the code:

# Neural network parameters
n_hidden_units = 500
n_classes = 10

# training set placeholders
input_X = tf.placeholder(dtype='float32', shape=(None, X_train.shape[1], X_train.shape[2]), name="input_X")
input_y = tf.placeholder(dtype='int32', shape=(None,), name="input_y")

# hidden layer
dim = X_train.shape[1] * X_train.shape[2]  # dimension of each training data point
flatten_X = tf.reshape(input_X, shape=(-1, dim))
weights_hidden_layer = tf.Variable(initial_value=np.zeros((dim, n_hidden_units)), dtype='float32')
bias_hidden_layer = tf.Variable(initial_value=np.zeros((1, n_hidden_units)), dtype='float32')
hidden_layer_output = tf.nn.relu(tf.matmul(flatten_X, weights_hidden_layer) + bias_hidden_layer)

# output layer
weights_output_layer = tf.Variable(initial_value=np.zeros((n_hidden_units, n_classes)), dtype='float32')
bias_output_layer = tf.Variable(initial_value=np.zeros((1, n_classes)), dtype='float32')
output_logits = tf.matmul(hidden_layer_output, weights_output_layer) + bias_output_layer
predicted_y = tf.nn.softmax(output_logits)

# loss
one_hot_labels = tf.one_hot(input_y, depth=n_classes, axis=-1)
loss = tf.losses.softmax_cross_entropy(one_hot_labels, output_logits)

# optimizer
optimizer = tf.train.MomentumOptimizer(0.01, 0.5).minimize(
    loss, var_list=[weights_hidden_layer, bias_hidden_layer, weights_output_layer, bias_output_layer])

This compiles, and I have checked the shapes of all the tensors; they match what I expect.
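For reference, this is roughly how I checked them (just printing the static shapes; which tensors to inspect is up to you):

# sanity check: print the static shapes of the main tensors
print(flatten_X.shape)            # (?, dim)
print(hidden_layer_output.shape)  # (?, 500)
print(output_logits.shape)        # (?, 10)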

However, I tried to run the optimizer using the following code:

# running the optimizer
s = tf.InteractiveSession()
s.run(tf.global_variables_initializer())
for i in range(5):
    s.run(optimizer, {input_X: X_train, input_y: y_train})
    loss_i = s.run(loss, {input_X: X_train, input_y: y_train})
    print("loss at iter %i: %.4f" % (i, loss_i))

And the loss stayed the same across all iterations!

I must have messed something up, but I fail to see what.

Any ideas? I would also appreciate comments on code style and/or TensorFlow tips.

Upvotes: 0

Views: 611

Answers (2)

Mohan Radhakrishnan

Reputation: 3197

You can visualize the weight histograms in TensorBoard to make this easier to see. I ran your code to do exactly that. A few extra lines are needed to set up TensorBoard logging, but the histogram summaries of the weights are easy to add.
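A minimal sketch of those extra lines (the log directory is a placeholder, and I reuse the session s and feed dict from your question):

merged = tf.summary.merge_all()                    # collect every tf.summary.* op in the graph
writer = tf.summary.FileWriter("./logs", s.graph)  # event files that TensorBoard reads
for i in range(5):
    s.run(optimizer, {input_X: X_train, input_y: y_train})
    summary = s.run(merged, {input_X: X_train, input_y: y_train})
    writer.add_summary(summary, i)                 # tag each histogram with the step number
writer.close()

Then launch it with tensorboard --logdir ./logs.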

Initialized to zeros

weights_hidden_layer = tf.Variable(initial_value=np.zeros((784, n_hidden_units)), dtype='float32')
tf.summary.histogram("weights_hidden_layer", weights_hidden_layer)

[TensorBoard histogram of weights_hidden_layer initialized to zeros]

Xavier initialization

initializer = tf.contrib.layers.xavier_initializer()
weights_hidden_layer = tf.Variable(initializer(shape=(784, n_hidden_units)), dtype='float32')
tf.summary.histogram("weights_hidden_layer", weights_hidden_layer)

[TensorBoard histogram of weights_hidden_layer with Xavier initialization]

Upvotes: 1

Abhishek Mishra

Reputation: 1994

You have made a mistake: you are initializing your weights with np.zeros. Use np.random.normal instead. You can choose the standard deviation of this Gaussian distribution based on the number of inputs going into a particular neuron (its fan-in). You can read more about it here.

The reason you want to initialize with a Gaussian distribution is to break symmetry. If all the weights are initialized to zero, you can work through backpropagation to see that all the weights in a layer receive the same update and will therefore evolve identically.
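A minimal sketch of what that could look like for your layers (the fan-in-scaled standard deviation is one common choice, not the only valid one):

fan_in = dim  # number of inputs feeding each hidden neuron
weights_hidden_layer = tf.Variable(
    initial_value=np.random.normal(0.0, 1.0 / np.sqrt(fan_in), (dim, n_hidden_units)),
    dtype='float32')
weights_output_layer = tf.Variable(
    initial_value=np.random.normal(0.0, 1.0 / np.sqrt(n_hidden_units), (n_hidden_units, n_classes)),
    dtype='float32')
# the biases can stay at zero; the random weights already break the symmetry

With this change the weights in a layer receive different gradients per unit, so the loss should start to decrease.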

Upvotes: 1
