Gauss

Reputation: 499

About TensorFlow backpropagation

If I have two neural networks A and B, and I use the output of network A to feed the input (a placeholder) of network B, and I use an optimizer to minimize the loss of network B, can network A's parameters be updated by backpropagation?

Upvotes: 1

Views: 643

Answers (1)

rdadolf

Reputation: 1248

Yes, if "feed" is done in TensorFlow; no, if you do it manually.

Specifically, if you evaluate A, then train B with those outputs manually fed in (say, as a feed dict), A will not change, because it is not involved in the training stage.

If you set the input of the B network to be the output of an op in A (instead of a tf.placeholder, for instance), then you can train the combined network, and that will update A's parameters. In that case, though, you're really training a single combined network "AB", not two separate networks.

A concrete example:

import numpy as np
import tensorflow as tf

# A network: maps a 100-d input to a 10-d output
A_input = tf.placeholder(tf.float32, [None, 100])
A_weights = tf.Variable(tf.random_normal([100, 10]))
A_output = tf.matmul(A_input, A_weights)

# B network: a separate graph, fed through its own placeholder
B_input = tf.placeholder(tf.float32, [None, 10])
B_weights = tf.Variable(tf.random_normal([10, 5]))
B_output = tf.matmul(B_input, B_weights)

# AB network: a B-like layer wired directly to A's output tensor
AB_input = A_output
AB_weights = tf.Variable(tf.random_normal([10, 5]))
AB_output = tf.matmul(AB_input, AB_weights)

test_inputs = np.random.rand(17, 100)
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# first case: evaluate A, pull its output into python, feed it to B
A_out = sess.run(A_output, feed_dict={A_input: test_inputs})
print('A output shape:', A_out.shape)
B_out = sess.run(B_output, feed_dict={B_input: A_out})
print('B output shape:', B_out.shape)

# second case: evaluate AB end-to-end; only the original input is fed
AB_out = sess.run(AB_output, feed_dict={A_input: test_inputs})
print('AB output shape:', AB_out.shape)

In the first case, we've fed network B with the outputs of network A using a feed_dict. This evaluates network A in TensorFlow, pulls the results back into python, then evaluates network B in TensorFlow. If you try to train network B in this fashion, you'll only update the parameters of network B, as the sketch below shows.
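
To make the first case concrete, here is a minimal training sketch continuing the script above; the target placeholder, squared-error loss, and gradient-descent optimizer are illustrative assumptions, not part of the original example:

B_target = tf.placeholder(tf.float32, [None, 5])
B_loss = tf.reduce_mean(tf.square(B_output - B_target))
# the optimizer sees all trainable variables, but A_weights has no
# gradient path to B_loss, so it simply receives no update
B_train = tf.train.GradientDescentOptimizer(0.01).minimize(B_loss)

A_before = sess.run(A_weights)
sess.run(B_train, feed_dict={B_input: A_out,
                             B_target: np.random.rand(17, 5)})
print('A_weights changed:', not np.allclose(A_before, sess.run(A_weights)))

This prints False: gradients cannot flow backwards through a feed_dict.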

In the second case, we've fed the "B" part of network AB by connecting the output of network A directly to the input of network AB. Evaluating network AB never pulls the intermediate results of network A back into python, so if you train network AB in this fashion, you can update the parameters of the combined network, including A's. (Note: the training inputs are fed to A_input of network AB, not to the intermediate tensor AB_input.)
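
A matching sketch for the second case, again with an assumed target and loss; here the gradient flows from AB_loss through AB_weights back into A_weights:

AB_target = tf.placeholder(tf.float32, [None, 5])
AB_loss = tf.reduce_mean(tf.square(AB_output - AB_target))
AB_train = tf.train.GradientDescentOptimizer(0.01).minimize(AB_loss)

A_before = sess.run(A_weights)
# feed the true input of the combined network, not AB_input
sess.run(AB_train, feed_dict={A_input: test_inputs,
                              AB_target: np.random.rand(17, 5)})
print('A_weights changed:', not np.allclose(A_before, sess.run(A_weights)))

This prints True, because A_output is an op in the same graph, so backpropagation reaches A_weights.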

Upvotes: 2
