Reputation: 759
I was looking at the mechanics section for Tensorflow, specifically on shared variables. In the section "The problem", they are dealing with a convolutional neural net, and provide the following code (which runs an image through the model):
# First call creates one set of variables.
result1 = my_image_filter(image1)
# Another set is created in the second call.
result2 = my_image_filter(image2)
If the model was implemented in such a way, would it then be impossible to learn/update the parameters because there's a new set of parameters for each image in my training set?
Edit: I've also tried "the problem" approach on a simple linear regression example, and there do not appear to be any issues with this method of implementation. Training seems to work as well as can be shown by the last line of the code. So I'm wondering if there is a subtle discrepancy in the tensorflow documentation and what I'm doing. :
import tensorflow as tf
import numpy as np
trX = np.linspace(-1, 1, 101)
trY = 2 * trX + np.random.randn(*trX.shape) * 0.33 # create a y value which is approximately linear but with some random noise
X = tf.placeholder("float") # create symbolic variables
Y = tf.placeholder("float")
def model(X):
with tf.variable_scope("param"):
w = tf.Variable(0.0, name="weights") # create a shared variable (like theano.shared) for the weight matrix
return tf.mul(X, w) # lr is just X*w so this model line is pretty simple
y_model = model(X)
cost = (tf.pow(Y-y_model, 2)) # use sqr error for cost function
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(cost) # construct an optimizer to minimize cost and fit line to my data
sess = tf.Session()
init = tf.initialize_all_variables() # you need to initialize variables (in this case just variable W)
sess.run(init)
with tf.variable_scope("train"):
for i in range(100):
for (x, y) in zip(trX, trY):
sess.run(train_op, feed_dict={X: x, Y: y})
print sess.run(y_model, feed_dict={X: np.array([1,2,3])})
Upvotes: 5
Views: 15516
Reputation: 13116
One has to create the variable set only once per whole training (and testing) set. The goal of variable scopes is to allow for modularization of subsets of parameters, such as those belonging to layers (e.g. when architecture of a layer is repeated, the same names can be used within each layer scope).
In your example you create parameters only in the model
function. You can print out your variable names to see that it is assigned to the specified scope:
from __future__ import print_function
X = tf.placeholder("float") # create symbolic variables
Y = tf.placeholder("float")
print("X:", X.name)
print("Y:", Y.name)
def model(X):
with tf.variable_scope("param"):
w = tf.Variable(0.0, name="weights") # create a shared variable (like theano.shared) for the weight matrix
print("w:", w.name)
return tf.mul(X, w)
The call to sess.run(train_op, feed_dict={X: x, Y: y})
only evaluates the value of train_op
given the provided values of X
and Y
. No new variables (incl. parameters) are created there; therefore, it has no effect. You can make sure the variable names stay the same by again printing them out:
with tf.variable_scope("train"):
print("X:", X.name)
print("Y:", Y.name)
for i in range(100):
for (x, y) in zip(trX, trY):
sess.run(train_op, feed_dict={X: x, Y: y})
You will see that variable names stay the same, as they are already initialized.
If you'd like to retrieve a variable using its scope, you need to use get_variable
within a tf.variable_scope
enclosure:
with tf.variable_scope("param"):
w = tf.get_variable("weights", [1])
print("w:", w.name)
Upvotes: 10