Pusheen_the_dev

Reputation: 2207

Tensorflow MNIST (Weight and bias variables)

I'm learning how to use TensorFlow with the MNIST tutorial, but I'm stuck on one point of the tutorial.

Here is the code provided:

from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
y_ = tf.placeholder(tf.float32, [None, 10])

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
saver = tf.train.Saver()
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
for i in range(1000):
  batch_xs, batch_ys = mnist.train.next_batch(100)
  sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

But I don't understand how the variables "W" (the weights) and "b" (the biases) change during the computation. As far as I can tell, they are initialized to zero on each batch, but what happens after that? I don't see where in the code they are modified.

Thank you very much in advance!

Upvotes: 3

Views: 3460

Answers (1)

mrry

Reputation: 126184

TensorFlow variables maintain their state from one run() call to the next. In your program they will be initialized to zero, and then progressively updated in the training loop.

The code that changes the values of the variables is created, implicitly, by this line:

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

In TensorFlow, a tf.train.Optimizer is a class that creates operations for updating variables, typically based on the gradient of some tensor (e.g. a loss) with respect to those variables. By default, when you call Optimizer.minimize(), TensorFlow creates operations to update all variables on which the given tensor (in this case cross_entropy) depends.

When you call sess.run(train_step), this runs a graph that includes those update operations, and therefore instructs TensorFlow to update the values of the variables.
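
To make the hidden update concrete, here is a rough NumPy sketch of what a single train_step amounts to for this particular model. It is not TensorFlow's actual implementation: the train_step function, the hand-derived gradient, and the random stand-in batches are all assumptions made for illustration. The point it shows is that W and b are created once, outside the loop, and every step modifies the same arrays.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def train_step(W, b, batch_xs, batch_ys, learning_rate=0.5):
    """Hypothetical stand-in for sess.run(train_step): updates W and b in place."""
    y = softmax(batch_xs @ W + b)
    n = batch_xs.shape[0]
    # For softmax followed by mean cross-entropy, the gradient with respect
    # to the logits is (y - y_) / n.
    grad_logits = (y - batch_ys) / n
    # Gradient descent: variable <- variable - learning_rate * gradient.
    W -= learning_rate * (batch_xs.T @ grad_logits)
    b -= learning_rate * grad_logits.sum(axis=0)

# Initialized to zero exactly once, like sess.run(init).
W = np.zeros((784, 10))
b = np.zeros(10)

rng = np.random.default_rng(0)
for i in range(3):
    # Random data standing in for mnist.train.next_batch(100).
    batch_xs = rng.random((100, 784))
    batch_ys = np.eye(10)[rng.integers(0, 10, size=100)]
    train_step(W, b, batch_xs, batch_ys)
    print(i, np.abs(W).sum())  # no longer zero after the first step
```

In the TensorFlow version you never write the gradient or the subtraction yourself: minimize() adds equivalent operations to the graph, and the variables hold their updated values between sess.run() calls.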

Upvotes: 11
