Raven Cheuk

Reputation: 3053

TensorFlow weights won't change before and after training

I am trying to print out the weights of my network before and after training by using this code:

weights = [v for v in tf.trainable_variables() if v.name == 'dense1/kernel:0'][0]
print(sess.run(weights))

However, the values don't change at all.

When I try to debug it by printing out the accuracy alongside the weights, I can see that the accuracy is improving, but the weights remain the same. The output while training looks like this:

weights = [-0.07634658 -0.03764156] acc = 0.1551000028848648
weights = [-0.07634658 -0.03764156] acc = 0.4083999991416931
weights = [-0.07634658 -0.03764156] acc = 0.4812999963760376
weights = [-0.07634658 -0.03764156] acc = 0.3167000114917755
weights = [-0.07634658 -0.03764156] acc = 0.49880000948905945
weights = [-0.07634658 -0.03764156] acc = 0.42320001125335693
weights = [-0.07634658 -0.03764156] acc = 0.4494999945163727
weights = [-0.07634658 -0.03764156] acc = 0.578000009059906
weights = [-0.07634658 -0.03764156] acc = 0.6047999858856201

Is this a bug? Or am I not printing the weights correctly?

Attached below is the simple model I am trying to debug:

import os

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

os.environ['CUDA_VISIBLE_DEVICES'] = '3'

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.1

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

X = tf.placeholder(tf.float32, [None, 784])
y_true = tf.placeholder(tf.int32, [None, 10])

layer1 = tf.layers.dense(X, 2, name='dense1')
output = tf.layers.dense(layer1, 10, name='dense2')

cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_true, logits=output))

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.5)
train = optimizer.minimize(cross_entropy)

# Build the accuracy ops once, outside the training loop, so new nodes
# are not added to the graph on every iteration.
correct_prediction = tf.equal(tf.argmax(output, 1), tf.argmax(y_true, 1))
acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

init = tf.global_variables_initializer()

sess = tf.Session(config=config)
sess.run(init)

weights = [v for v in tf.trainable_variables() if v.name == 'dense1/kernel:0'][0]

print("weights before training", sess.run(weights))

for step in range(1000):
    batch_x, batch_y = mnist.train.next_batch(100)
    sess.run(train, feed_dict={X: batch_x, y_true: batch_y})

    if step % 50 == 0:
        # Note: weights[0] is only the first row of the (784, 2) kernel,
        # i.e. the two weights connected to input pixel 0.
        print("weights = ", sess.run(weights[0]))
        print("acc = ", sess.run(acc, feed_dict={X: mnist.test.images,
                                                 y_true: mnist.test.labels}))

weights_after = [v for v in tf.trainable_variables() if v.name == 'dense1/kernel:0'][0]

print("weights after training", sess.run(weights_after))

Upvotes: 0

Views: 281

Answers (2)

qinlong

Reputation: 731

The weights in 'dense1/kernel' (W1) didn't change, but the weights in 'dense2/kernel' (W2) did, and the updated W2 is what accounts for the improving accuracy. In other words, W1 is not being updated by gradient descent here, while W2 is. By the way, add sess.close() at the end if you are not using with tf.Session() as sess:.
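A minimal way to check this yourself is to snapshot both kernels around a single training step and compare them. The sketch below assumes the graph from the question has already been built (so sess, train, X, y_true, and mnist exist as above) and adds a NumPy import for the comparison:

import numpy as np

# Fetch both kernel variables from the graph built in the question.
w1 = [v for v in tf.trainable_variables() if v.name == 'dense1/kernel:0'][0]
w2 = [v for v in tf.trainable_variables() if v.name == 'dense2/kernel:0'][0]

w1_before, w2_before = sess.run([w1, w2])

# Run one more optimization step.
batch_x, batch_y = mnist.train.next_batch(100)
sess.run(train, feed_dict={X: batch_x, y_true: batch_y})

w1_after, w2_after = sess.run([w1, w2])

# Largest absolute change in each kernel after the step.
print("max |dW1| =", np.max(np.abs(w1_after - w1_before)))
print("max |dW2| =", np.max(np.abs(w2_after - w2_before)))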

Upvotes: 1

Simon Caby

Reputation: 191

It seems fine, but it is missing the model-building step: Model(inputs=X, outputs=output).
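For reference, that model-building step belongs to the Keras functional API rather than to the session-based graph in the question. A minimal sketch of what it would look like there, with layer sizes mirroring the original code:

import tensorflow as tf

# Keras functional-API equivalent of the two-layer network above.
inputs = tf.keras.Input(shape=(784,))
hidden = tf.keras.layers.Dense(2, name='dense1')(inputs)
outputs = tf.keras.layers.Dense(10, name='dense2')(hidden)

# Building the Model ties the layers into a single trainable object.
model = tf.keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='sgd',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# model.get_layer('dense1').get_weights() returns [kernel, bias]
# as NumPy arrays, which is another way to inspect the weights.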

Upvotes: 0
