Reputation: 3053
I am trying to print out the weights of my network before and after training by using this code:
weights = [v for v in tf.trainable_variables() if v.name == 'dense1/kernel:0'][0]
print(sess.run(weights))
However, the values don't change at all.
When I try to debug it by printing the accuracy alongside the weights, I can see that the accuracy is improving, but the weights remain the same. The output during training looks like this:
weights = [-0.07634658 -0.03764156] acc = 0.1551000028848648
weights = [-0.07634658 -0.03764156] acc = 0.4083999991416931
weights = [-0.07634658 -0.03764156] acc = 0.4812999963760376
weights = [-0.07634658 -0.03764156] acc = 0.3167000114917755
weights = [-0.07634658 -0.03764156] acc = 0.49880000948905945
weights = [-0.07634658 -0.03764156] acc = 0.42320001125335693
weights = [-0.07634658 -0.03764156] acc = 0.4494999945163727
weights = [-0.07634658 -0.03764156] acc = 0.578000009059906
weights = [-0.07634658 -0.03764156] acc = 0.6047999858856201
Is this a bug, or am I not printing the weights correctly?
Attached below is the simple model I am trying to debug:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import os

os.environ['CUDA_VISIBLE_DEVICES'] = '3'
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.1

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

X = tf.placeholder(tf.float32, [None, 784])
y_true = tf.placeholder(tf.int32, [None, 10])

layer1 = tf.layers.dense(X, 2, name='dense1')
output = tf.layers.dense(layer1, 10, name='dense2')

cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_true, logits=output))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.5)
train = optimizer.minimize(cross_entropy)

init = tf.global_variables_initializer()
sess = tf.Session(config=config)
sess.run(init)

weights = [v for v in tf.trainable_variables() if v.name == 'dense1/kernel:0'][0]
print("weights before training", sess.run(weights))

for step in range(1000):
    batch_x, batch_y = mnist.train.next_batch(100)
    sess.run(train, feed_dict={X: batch_x, y_true: batch_y})
    if step % 50 == 0:
        weights = [v for v in tf.trainable_variables() if v.name == 'dense1/kernel:0'][0]
        print("weights = ", sess.run(weights[0]))
        correct_prediction = tf.equal(tf.argmax(output, 1), tf.argmax(y_true, 1))
        acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        print("acc = ", sess.run(acc, feed_dict={X: mnist.test.images, y_true: mnist.test.labels}))

weights_after = [v for v in tf.trainable_variables() if v.name == 'dense1/kernel:0'][0]
print("weights after training", sess.run(weights_after))
Upvotes: 0
Views: 281
Reputation: 731
The weights in 'dense1/kernel' (W1) don't change, but the weights in 'dense2/kernel' (W2) do. The updates to W2 are what produce the improving accuracy. In other words, W1 is not being updated by gradient descent in your run, but W2 is.
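You can verify this yourself. Here is a minimal sketch (assuming the graph from the question is already built and that sess, train, X, y_true, and mnist are still in scope) that fetches both kernels by name and compares them across a few extra training steps:

import numpy as np

# Look up both kernel tensors by name in the default graph.
w1 = tf.get_default_graph().get_tensor_by_name('dense1/kernel:0')
w2 = tf.get_default_graph().get_tensor_by_name('dense2/kernel:0')

w1_before, w2_before = sess.run([w1, w2])
for _ in range(100):  # a few extra training steps
    batch_x, batch_y = mnist.train.next_batch(100)
    sess.run(train, feed_dict={X: batch_x, y_true: batch_y})
w1_after, w2_after = sess.run([w1, w2])

print("W1 changed:", not np.allclose(w1_before, w1_after))
print("W2 changed:", not np.allclose(w2_before, w2_after))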
By the way, add sess.close() at the end if you don't use with tf.Session() as sess:
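For reference, a minimal sketch of the context-manager form (assuming the same config, init, and training loop as in the question), which closes the session automatically:

# The session is closed automatically when the block exits,
# so no explicit sess.close() is needed.
with tf.Session(config=config) as sess:
    sess.run(init)
    # ... training loop from the question goes here ...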
Upvotes: 1
Reputation: 191
It seems fine, but it is missing the model-building step: Model(inputs=X, outputs=output).
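A minimal sketch of that model-building step, on the assumption that this refers to the tf.keras functional API (whose argument names are inputs and outputs) rather than the question's raw tf.layers graph:

# Hypothetical Keras version of the same two-layer model.
inputs = tf.keras.Input(shape=(784,))
hidden = tf.keras.layers.Dense(2, name='dense1')(inputs)
outputs = tf.keras.layers.Dense(10, activation='softmax', name='dense2')(hidden)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer=tf.keras.optimizers.SGD(lr=0.5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

With a Keras Model, model.get_layer('dense1').get_weights() returns the kernel and bias as NumPy arrays, which would also answer the original question about inspecting the weights.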
Upvotes: 0