d-roy

Reputation: 759

Understanding the variable scope example in TensorFlow

I was looking at the mechanics section of the TensorFlow documentation, specifically the part on shared variables. In the section "The problem", they deal with a convolutional neural net and provide the following code (which runs an image through the model):

# First call creates one set of variables.
result1 = my_image_filter(image1)
# Another set is created in the second call.
result2 = my_image_filter(image2)

If the model were implemented this way, would it be impossible to learn/update the parameters, because a new set of parameters is created for each image in my training set?

Edit: I've also tried the approach from "The problem" on a simple linear regression example, and this implementation does not appear to cause any issues. Training seems to work as well, as the last line of the code shows. So I'm wondering whether there is a subtle discrepancy between the TensorFlow documentation and what I'm doing:

import tensorflow as tf
import numpy as np

trX = np.linspace(-1, 1, 101)
trY = 2 * trX + np.random.randn(*trX.shape) * 0.33  # y is approximately linear in x, plus some random noise

X = tf.placeholder("float") # create symbolic variables
Y = tf.placeholder("float")


def model(X):
    with tf.variable_scope("param"):
        w = tf.Variable(0.0, name="weights") # create a shared variable (like theano.shared) for the weight matrix

    return tf.mul(X, w) # lr is just X*w so this model line is pretty simple


y_model = model(X)

cost = (tf.pow(Y-y_model, 2)) # use sqr error for cost function

train_op = tf.train.GradientDescentOptimizer(0.01).minimize(cost) # construct an optimizer to minimize cost and fit line to my data

sess = tf.Session()
init = tf.initialize_all_variables() # you need to initialize variables (in this case just variable W)
sess.run(init)

with tf.variable_scope("train"):
    for i in range(100):
        for (x, y) in zip(trX, trY):
            sess.run(train_op, feed_dict={X: x, Y: y})

print(sess.run(y_model, feed_dict={X: np.array([1,2,3])}))

Upvotes: 5

Views: 15516

Answers (1)

Dima Lituiev

Reputation: 13116

The set of variables has to be created only once for the whole training (and testing) run, not once per example. The goal of variable scopes is to allow modularization of subsets of parameters, such as those belonging to individual layers (e.g. when the architecture of a layer is repeated, the same variable names can be used within each layer's scope).

In your example the parameters are created only in the model function, which is called once. You can print the variable names to see that they are assigned to the specified scope:

from __future__ import print_function
import tensorflow as tf

X = tf.placeholder("float") # create symbolic variables
Y = tf.placeholder("float")
print("X:", X.name)
print("Y:", Y.name)

def model(X):
    with tf.variable_scope("param"):
        w = tf.Variable(0.0, name="weights") # create a shared variable (like theano.shared) for the weight matrix
    print("w:", w.name)
    return tf.mul(X, w) 

The call to sess.run(train_op, feed_dict={X: x, Y: y}) only evaluates train_op for the provided values of X and Y. No new variables (including parameters) are created there, so wrapping the loop in tf.variable_scope("train") has no effect. You can confirm that the names stay the same by printing them again:

with tf.variable_scope("train"):
    print("X:", X.name)
    print("Y:", Y.name)
    for i in range(100):
        for (x, y) in zip(trX, trY):
            sess.run(train_op, feed_dict={X: x, Y: y})

You will see that the names stay the same: they were fixed when the tensors were created, and entering a new scope afterwards does not rename them.
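The split between building the model once and running the training step many times can be illustrated without TensorFlow. The following is a minimal pure-Python sketch, not TensorFlow code: Param, build_model and train_step are hypothetical names standing in for a variable, the model function and train_op respectively. It assumes the same squared-error cost and learning rate as the question.

```python
from __future__ import print_function
import random


class Param(object):
    """Hypothetical stand-in for a trainable variable (a mutable scalar)."""
    created = 0  # counts how many parameters have been constructed

    def __init__(self, value):
        self.value = value
        Param.created += 1


def build_model():
    """Plays the role of model(X): called once, creates the single weight."""
    return Param(0.0)


def train_step(w, x, y, lr=0.01):
    """Plays the role of sess.run(train_op): one gradient step on (y - x*w)**2."""
    grad = -2.0 * x * (y - x * w.value)
    w.value -= lr * grad  # updates the existing parameter in place


random.seed(0)
xs = [-1.0 + 0.02 * i for i in range(101)]
ys = [2.0 * x + random.gauss(0.0, 0.33) for x in xs]

w = build_model()          # construction phase: exactly one parameter is created
for _ in range(100):       # execution phase: nothing is created in this loop
    for x, y in zip(xs, ys):
        train_step(w, x, y)

print("parameters created:", Param.created)
print("learned weight:", w.value)
```

The counter stays at one however many steps are run, which is why training works in the question's script: the model function is called a single time, and every step updates that one weight.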

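If you'd like to retrieve a variable by its scope, use tf.get_variable inside a tf.variable_scope block. Note that tf.get_variable only shares variables that were themselves created with tf.get_variable, and retrieving an existing one requires reuse to be enabled on the scope; without it, the call below creates a new variable: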

with tf.variable_scope("param"):
    w = tf.get_variable("weights", [1])
print("w:", w.name)
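To make the sharing idea concrete, here is a toy pure-Python sketch of a name-keyed variable store. It is not TensorFlow's implementation: VariableStore, its get_variable method and image_filter are hypothetical, and they only mimic the general behaviour of creating a variable under a scoped name the first time and handing back the same object when reuse is requested.

```python
from __future__ import print_function


class VariableStore(object):
    """Hypothetical store that keys variables by "scope/name"."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, scope, name, initial=0.0, reuse=False):
        full_name = scope + "/" + name
        if reuse:
            # Reuse mode: the variable must already exist.
            if full_name not in self._vars:
                raise ValueError("%s does not exist" % full_name)
            return self._vars[full_name]
        # Create mode: refuse to silently make a second copy.
        if full_name in self._vars:
            raise ValueError("%s already exists; pass reuse=True" % full_name)
        self._vars[full_name] = [initial]  # one-element list as a mutable value
        return self._vars[full_name]

    def names(self):
        return sorted(self._vars)


def image_filter(store, scope, reuse=False):
    """Hypothetical model function that asks the store for its weight by name."""
    return store.get_variable(scope, "weights", reuse=reuse)


store = VariableStore()
w1 = image_filter(store, "param")              # first call creates param/weights
w2 = image_filter(store, "param", reuse=True)  # second call returns the same object
w3 = image_filter(store, "layer2")             # same name, different scope

print(store.names())
print("w1 and w2 shared:", w1 is w2)
print("w1 and w3 shared:", w1 is w3)
```

Calling image_filter(store, "param") a second time without reuse=True raises ValueError in this sketch, so an accidental second set of parameters is reported instead of being created silently.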

Upvotes: 10
