Reputation: 1
The code implementing the autoencoder is shown below:
# One Layer Autoencoder
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Assumed setup (not shown in the original post): MNIST loader and session config
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
config = tf.ConfigProto(allow_soft_placement=True)

# Parameters
learning_rate = 0.01
training_epochs = 20
batch_size = 256
display_step = 1
examples_to_show = 10

# Network Parameters
n_hidden = 128  # 1st layer num features
n_input = 784   # MNIST data input (img shape: 28*28)

# tf Graph input (only pictures)
X = tf.placeholder("float", [None, n_input])

weights = {
    'encoder_h': tf.Variable(tf.random_normal([n_input, n_hidden])),
    'decoder_h': tf.Variable(tf.random_normal([n_hidden, n_input])),
}
biases = {
    'encoder_b': tf.Variable(tf.random_normal([n_hidden])),
    'decoder_b': tf.Variable(tf.random_normal([n_input])),
}

# Building the encoder and the decoder
hidden_layer = tf.nn.sigmoid(tf.add(tf.matmul(X, weights["encoder_h"]), biases["encoder_b"]))
out_layer = tf.nn.sigmoid(tf.add(tf.matmul(hidden_layer, weights["decoder_h"]), biases["decoder_b"]))

# Prediction
y_pred = out_layer
# Targets (Labels) are the input data.
y_true = X

# Define loss and optimizer, minimize the squared error
cost = tf.reduce_mean(tf.pow(y_true - y_pred, 2))
optimizer = tf.train.RMSPropOptimizer(learning_rate).minimize(cost)

# Initializing the variables
init = tf.initialize_all_variables()

with tf.device("/gpu:0"):
    with tf.Session(config=config) as sess:
        sess.run(init)
        total_batch = int(mnist.train.num_examples / batch_size)
        print([total_batch, batch_size, mnist.train.num_examples])
        for epoch in range(training_epochs):  # each round
            for i in range(total_batch):
                batch_xs, batch_ys = mnist.train.next_batch(batch_size)
                # Run optimization op (backprop) and cost op (to get loss value)
                _, loss_c = sess.run([optimizer, cost], feed_dict={X: batch_xs})
            if epoch % display_step == 0:
                # Print a single entry of each weight matrix, then the last batch loss
                encoder_w = weights["encoder_h"]
                encoder_w_eval = encoder_w.eval()
                print("encoder_w", encoder_w_eval[0, 0])
                decoder_w = weights["decoder_h"]
                decoder_w_eval = decoder_w.eval()
                print("decoder_w", decoder_w_eval[0, 0])
                print("Epoch:", "%04d" % (epoch + 1),
                      "cost=", "{:.9f}".format(loss_c))
        print("Optimization Finished!")
I print the encoder weight, the decoder weight, and the loss. The decoder weight and the loss change during training, but the encoder weight stays the same, as shown below, and I don't know why. Can somebody help?
encoder_w -0.00818192
decoder_w -1.48731
Epoch: 0001 cost= 0.132702485
encoder_w -0.00818192
decoder_w -1.4931
Epoch: 0002 cost= 0.089116640
encoder_w -0.00818192
decoder_w -1.49607
Epoch: 0003 cost= 0.080637991
encoder_w -0.00818192
decoder_w -1.49947
Epoch: 0004 cost= 0.073829792
encoder_w -0.00818192
decoder_w -1.50176
...
Upvotes: 0
Views: 622
Reputation: 2312
The weights always behave in that manner; that is, they always have a Gaussian distribution. Note that your input could follow any distribution in high dimensions. In addition, it seems that if you combine different types of distributions, you end up with a Gaussian distribution (this is from probability theory). As a result, the distribution of the weights will somehow generalize and will continue to follow a Gaussian distribution. Note also that, with Batch Normalization, the aim is to force the output of the activation functions to follow a Gaussian distribution. This is the general intuition.
In addition, L2 regularization is in some way forcing the weights to follow a Gaussian distribution.
Finally, printing the weights as you are doing is wrong. Rather, you should use TensorBoard.
Hope this answer helps.
Upvotes: 1
Reputation: 5206
In general, I recommend inspecting the graph in TensorBoard to make sure that it looks like you expect it to (for example, that there are gradient updates for the encoder weights).
In your case, it could be that encoder_w[0, 0] doesn't change much because its gradients happen to be small, and so is the learning rate.
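One concrete way the gradient for encoder_w[0, 0] can be small, or even exactly zero, is that the gradient of the loss with respect to encoder weight [i, j] is the input value i multiplied by the error signal reaching hidden unit j. If input 0 is zero in every training image (an assumption here, not checked against the poster's data, though corner pixels of MNIST digits are normally blank), every weight in row 0 of the encoder matrix receives no update, while the decoder weights still move. The following is a minimal pure-Python sketch of that effect on a tiny hypothetical network (4 inputs, 3 hidden units), with the same sigmoid layers and mean squared error as in the question; the helper name encoder_gradient and all sizes are made up for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encoder_gradient(x, w_enc, b_enc, w_dec, b_dec):
    """Gradient of mean((x - y)**2) with respect to w_enc for one sample x."""
    n_in, n_hid = len(x), len(b_enc)
    # Forward pass: input -> sigmoid hidden layer -> sigmoid reconstruction.
    h = [sigmoid(sum(x[i] * w_enc[i][j] for i in range(n_in)) + b_enc[j])
         for j in range(n_hid)]
    y = [sigmoid(sum(h[j] * w_dec[j][k] for j in range(n_hid)) + b_dec[k])
         for k in range(n_in)]
    # Backward pass through the output sigmoid, then the hidden sigmoid.
    d_out = [2.0 * (y[k] - x[k]) / n_in * y[k] * (1.0 - y[k])
             for k in range(n_in)]
    d_hid = [sum(d_out[k] * w_dec[j][k] for k in range(n_in)) * h[j] * (1.0 - h[j])
             for j in range(n_hid)]
    # dL/dw_enc[i][j] = x[i] * d_hid[j]: row i is scaled by input i.
    return [[x[i] * d_hid[j] for j in range(n_hid)] for i in range(n_in)]


rng = random.Random(0)
n_in, n_hid = 4, 3  # tiny hypothetical sizes instead of 784 and 128
w_enc = [[rng.gauss(0.0, 1.0) for _ in range(n_hid)] for _ in range(n_in)]
w_dec = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_hid)]
b_enc = [rng.gauss(0.0, 1.0) for _ in range(n_hid)]
b_dec = [rng.gauss(0.0, 1.0) for _ in range(n_in)]

x = [0.0, 0.8, 0.3, 0.5]  # input 0 is zero, like a blank corner pixel
grad = encoder_gradient(x, w_enc, b_enc, w_dec, b_dec)
grad_row0, grad_row1 = grad[0], grad[1]
print("gradient for w_enc[0][:]:", grad_row0)  # row fed by the zero input
print("gradient for w_enc[1][:]:", grad_row1)  # row fed by a nonzero input
```

If this is what is happening, printing a weight whose row corresponds to a pixel near the center of the image, or a summary over the whole encoder matrix such as its norm, would probably be a more informative check than entry [0, 0].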
Upvotes: 0