Reputation: 197
I am new to tensorflow and I am following this tutorial, which uses variational dropout to train a NN: https://gist.github.com/VikingPenguinYT/665769ba03115b1a0888893eaf1d4f13
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(10000):
        sess.run(train_step, {model_X: X, model_y: y})
        if i % 100 == 0:
            mse = sess.run(model_mse, {model_X: X, model_y: y})
            print("Iteration {}. Mean squared error: {:.4f}.".format(i, mse))

    # Sample from the posterior.
    n_post = 1000
    Y_post = np.zeros((n_post, X_pred.shape[0]))
    for i in range(n_post):
        Y_post[i] = sess.run(model_pred, {model_X: X_pred})
My questions are:
(1) I know that tf.Session is used to train and evaluate the NN. But are the parameter values trained by the optimizer still available outside the training loop (e.g., from the line beginning with # Sample from the posterior)? Are they global variables, or just local variables inside the sess? Or are they still available because the same sess is being used? If I want to use those parameters to evaluate my objective function, can I always simply call them with sess.run? What if there are several training steps in my code?
(2) This code uses all the training data to make one update, is that correct? Can I switch to SGD? (Why is the AdamOptimizer used here instead of backprop?) For this variational dropout problem, the dropout is kept for both training and testing. Are AdamOptimizer and backprop able to automatically take the dropout into account? (It seems pretty clever though...) Or will there be a problem if I use backprop?
(3) The training procedure only updates the variables defined with tf.Variable (M and m in this example), is that correct? What about the intermediate variables (W in this example)?
Thanks!
Upvotes: 3
Views: 1546
Reputation: 32111
(1) is it still available because it is using the same sess?
Yes. Variable values live inside a session (they are initialized there when you run the initializer op, and they don't exist outside of a session). If you trained a network in one session and then want to run inference, it's as trivial as sess.run([your_output_op], feed_dict=...).
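For instance, a minimal self-contained sketch (a toy linear model I made up for illustration, not the tutorial's network):

import numpy as np
import tensorflow as tf  # assumes TF 1.x, as in the question

x = tf.placeholder(tf.float32, [None, 1])
y_ = tf.placeholder(tf.float32, [None, 1])
w = tf.Variable(tf.zeros([1, 1]))                 # the trainable weight
output_op = tf.matmul(x, w)                       # prediction op
loss = tf.reduce_mean(tf.square(output_op - y_))
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

X = np.random.rand(100, 1).astype(np.float32)
Y = 3.0 * X

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(100):
        sess.run(train_op, {x: X, y_: Y})   # training updates w
    preds = sess.run(output_op, {x: X})     # same session, so the trained w is still there
# Once the with-block exits the session is closed and the values are gone.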
Notice that you have effectively 4 types of tensors (data) in tensorflow:
1) Placeholders - tf.placeholder(...) - these are tensors that you need to pass into the network using feed_dict. Note that if you don't use a placeholder in a computation, you don't need to pass it in. The obvious example is your labels, which are needed during training but not at inference time.
2) Variables - tf.Variable(...) - these are mutable tensors that stay around between calls to sess.run() (nothing stays around after you close the session). A common example is the weights of your neural network.
3) OPs (computed tensors) - tf.add(a, b) - an op (operation) is a computation that is part of your computation graph. An example would be the result of your loss function. These values are not kept around between calls to sess.run; they are computed dynamically (if needed) during any specific call to sess.run.
4) Constants - tf.constant(42) - as the name suggests.
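To make the four kinds concrete, here is a toy sketch (all names are my own, assuming TF 1.x):

import tensorflow as tf

a = tf.placeholder(tf.float32, shape=[])   # 1) placeholder: fed via feed_dict
v = tf.Variable(1.0)                       # 2) variable: persists between sess.run calls
total = tf.add(a, v) + tf.constant(42.0)   # 3) op and 4) constant
bump = v.assign(v + 1.0)                   # mutating a variable is itself an op

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(total, {a: 1.0}))  # 44.0: total is recomputed on demand
    sess.run(bump)                    # v becomes 2.0 and stays 2.0
    print(sess.run(total, {a: 1.0}))  # 45.0: the variable's new value persisted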
(2) The Optimizer is the OP (tensorflow operation) which actually updates the weight variables using the gradients computed by backprop, so it isn't Adam instead of backprop: backprop computes the gradients, and the optimizer decides how to apply them. The optimizer (such as Adam, SGD, RMSProp, etc.) will update all of GraphKeys.TRAINABLE_VARIABLES (a default collection of all the trainable variables); when you get more advanced you can specify exactly which variables the optimizer updates (the use cases where you need this are non-trivial). Note that Adam is one of many such optimizers; plain SGD (stochastic gradient descent) is another. Think of Adam as a more refined version of SGD.
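Both points in one sketch (the variables and loss here are made up for illustration, assuming TF 1.x):

import tensorflow as tf

w1 = tf.Variable(1.0, name="w1")
w2 = tf.Variable(1.0, name="w2")
loss = tf.square(w1 * w2 - 3.0)

# Swapping optimizers is a one-line change; both apply backprop gradients:
train_op = tf.train.AdamOptimizer(0.01).minimize(loss)
# train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

# Advanced: restrict the update to a subset of variables via var_list.
train_w1_only = tf.train.AdamOptimizer(0.01).minimize(loss, var_list=[w1])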
(3) Try printing out all your trainable variables to see what's there; these are the variables that your optimizer operates on by default: Using tf.trainable_variables() to show names of trainable variables
The TRAINABLE_VARIABLES collection is a standard convention in tensorflow. Most tensorflow operations know whether they should add variables to TRAINABLE_VARIABLES and do the right thing by default, without you needing to worry about it. Any time you define weights with tf.Variable(...), the variable you create is added to the TRAINABLE_VARIABLES collection by default.
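For example (a hypothetical setup that only loosely mirrors the M, m and W from the gist):

import tensorflow as tf

M = tf.Variable(tf.zeros([10, 10]), name="M")   # trainable by default
m = tf.Variable(tf.zeros([10]), name="m")       # trainable by default
frozen = tf.Variable(0.0, trainable=False)      # opted out of the collection
W = M * 0.5 + m                                 # an op: recomputed from M and m each run

for v in tf.trainable_variables():
    print(v.name, v.shape)  # prints M:0 and m:0, but neither frozen nor W

So the optimizer only ever touches M and m; W is not stored anywhere, it is just recomputed from the current M and m on every call to sess.run.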
Upvotes: 2