M.Arıcı

Reputation: 47

TensorFlow: calling a graph inside another graph

I need to feed the "logits" of one graph (g1) as the input to another graph (g2), then read g2's layer outputs for that input. After some calculations on those layer outputs, I need to return a custom loss value to g1.

Here is the first graph:

g1 = tf.Graph()
with g1.as_default():
    X = tf.placeholder(dtype=tf.float32, shape=[...])
    Y = tf.placeholder(dtype=tf.float32, shape=[...])
    ...
    logits = tf.matmul(flatten, W2) + b2

    def custom_loss(logits):
        # get layer output values of g2 on the input "logits" 
        # some calculations on layer outputs
        return loss

    mse = tf.reduce_mean(tf.squared_difference(logits, Y))

    loss = mse + custom_loss(logits)

    step = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(loss)

sess1 = tf.InteractiveSession(graph=g1)
tf.global_variables_initializer().run()

Here is the second graph:

g2 = tf.Graph()
with g2.as_default():
    X = tf.placeholder(dtype=tf.float32, shape=[...])
    Y = tf.placeholder(dtype=tf.float32, shape=[...])
    ...
    loss = ...
    step = ...

sess2 = tf.InteractiveSession(graph=g2)
tf.global_variables_initializer().run()

I am not sure whether this is possible. The first problem is that the two graphs run in different sessions, so I cannot pass "logits" to g2 from inside g1.

The second problem is that g2 expects an array for "X", but "logits" is a tensor, so feeding it to g2 will not work. I could convert it to a NumPy array with a session, but how can I use a session inside a graph? I create the session only after building the graph.

I need your suggestions to solve these problems. Thanks in advance.

Upvotes: 1

Views: 308

Answers (1)

Vlad

Reputation: 8605

Consider the following example. The first graph is defined as follows:

import tensorflow as tf

graph1 = tf.Graph()
with graph1.as_default():
    x1 = tf.placeholder(tf.float32, shape=[None, 2])
    y1 = tf.placeholder(tf.int32, shape=[None])

    with tf.name_scope('network'):
        logits1 = tf.layers.dense(x1, units=2)

    train_vars1 = tf.trainable_variables()

And the second graph:

graph2 = tf.Graph()
with graph2.as_default():
    x2 = tf.placeholder(tf.float32, shape=[None, 2])
    y2 = tf.placeholder(tf.int32, shape=[None])

    with tf.name_scope('network'):
        logits2 = tf.layers.dense(x2, units=2)

    with tf.name_scope('loss'):
        xentropy2 = tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=y2, logits=logits2)
        loss_fn2 = tf.reduce_mean(xentropy2)

    with tf.name_scope('optimizer'):
        optimizer2 = tf.train.GradientDescentOptimizer(0.01)
        train_op2 = optimizer2.minimize(loss_fn2)
    train_vars2 = tf.trainable_variables()

Now you want to feed the output of the first graph's logits layer as input to the second graph. To do this, create two sessions, initialize the variables, evaluate the logits layer of the first graph, and feed the evaluated value as input to the second graph. I'm going to use a toy blobs dataset to illustrate:

from sklearn.datasets import make_blobs

x_train, y_train = make_blobs(n_samples=4,
                              n_features=2,
                              centers=[[1, 1], [-1, -1]],
                              cluster_std=0.5)
sess1 = tf.Session(graph=graph1)
sess2 = tf.Session(graph=graph2)

_ = sess1.run([v.initializer for v in train_vars1])
_ = sess2.run([v.initializer for v in train_vars2])

# feed the logits layer of graph1 as input to graph2
logits1_val = sess1.run(logits1, feed_dict={x1:x_train})
logits2_val = sess2.run(logits2, feed_dict={x2:logits1_val})
print(logits2_val)
# [[ 1.3904244   2.811252  ]
#  [-0.39521402 -1.6812694 ]
#  [-1.7728546  -4.522432  ]
#  [ 0.6836863   3.2234416 ]]

Note that the evaluated logits of the first graph (logits1_val) are already a NumPy array, so you can feed them as-is to the second graph. The same applies when you want to run a training step for the second graph:

# train step for the second graph
logits1_val = sess1.run(logits1, feed_dict={x1:x_train})
loss_val2, _ = sess2.run([loss_fn2, train_op2], feed_dict={x2:logits1_val, y2:y_train})
print(loss_val2) # 0.8134985
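
Because logits1_val crosses between the sessions as a plain array, train_op2 only updates the variables of the second graph; nothing from loss_fn2 reaches the first graph's weights. If the graphs have to stay separate, one way the loss signal could still get back is to pass the gradient by hand. The following is a minimal sketch of that idea in plain Python, with scalars standing in for tensors. The names stage1_forward, stage2_loss_and_grad, w1 and w2 are hypothetical and not part of the code above:

```python
# Scalar stand-ins for the two graphs. Every name below is hypothetical
# and chosen for illustration; none of it is TensorFlow API.

def stage1_forward(w1, x):
    """'Graph 1': turn the input into logits."""
    return w1 * x


def stage2_loss_and_grad(w2, logits, y):
    """'Graph 2': loss for a plain input value, plus d(loss)/d(logits)."""
    out = w2 * logits
    loss = (out - y) ** 2
    dloss_dlogits = 2.0 * (out - y) * w2
    return loss, dloss_dlogits


w1, w2 = 0.5, 2.0      # one weight per stage
x, y = 3.0, 1.0        # one training example
lr = 0.01

# Forward: the logits cross the boundary as a plain number,
# the way logits1_val crosses as a NumPy array.
logits = stage1_forward(w1, x)
loss_before, dloss_dlogits = stage2_loss_and_grad(w2, logits, y)

# Backward: stage 2 hands back d(loss)/d(logits); stage 1 finishes the
# chain rule with its own local derivative d(logits)/d(w1) = x.
dloss_dw1 = dloss_dlogits * x
w1_updated = w1 - lr * dloss_dw1

loss_after, _ = stage2_loss_and_grad(w2, stage1_forward(w1_updated, x), y)
print(loss_before, loss_after)  # the loss drops after the stage-1 update
```

In TensorFlow 1.x the same handoff could presumably be wired with tf.gradients: evaluate the gradient of loss_fn2 with respect to x2 in sess2, then feed that value into the first graph through the grad_ys argument. That wiring is an assumption and is not tested here.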

UPDATE: if we define both networks in the same graph:

import tensorflow as tf
from sklearn.datasets import make_blobs

x_train, y_train = make_blobs(n_samples=4,
                              n_features=2,
                              centers=[[1, 1], [-1, -1]],
                              cluster_std=0.5)

with tf.variable_scope('network_1'):
    x = tf.placeholder(tf.float32, shape=[None, 2])
    y = tf.placeholder(tf.int32, shape=[None])

    with tf.name_scope('network'):
        logits1 = tf.layers.dense(x, units=2)

with tf.variable_scope('network_2'):
    with tf.name_scope('network'):
        logits2 = tf.layers.dense(logits1, units=2) # <-- output of `network_1` is input to `network_2`

    with tf.name_scope('custom_loss'):
        # Define your custom loss here. I use cross-entropy
        # for illustration
        xentropy2 = tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=y, logits=logits2)
        custom_loss2 = tf.reduce_mean(xentropy2)

    with tf.name_scope('optimizer'):
        optimizer2 = tf.train.GradientDescentOptimizer(0.01)
        var_list = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,
                                     scope='network_2')
        train_op2 = optimizer2.minimize(custom_loss2, var_list=var_list)

with tf.variable_scope('network_1'):
    # Take the `custom_loss2` from `network_2` and create a new custom loss
    # for `network_1`
    xentropy1 = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=y, logits=logits1)
    custom_loss1 = tf.reduce_mean(xentropy1) + custom_loss2 # <-- loss from `network_2`
    optimizer1 = tf.train.AdamOptimizer(0.01)
    var_list = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,
                                 scope='network_1')
    train_op1 = optimizer1.minimize(custom_loss1, var_list=var_list)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # grad update step + loss computation for first network
    loss1, _ = sess.run([custom_loss1, train_op1], feed_dict={x:x_train, y:y_train})
    print(loss1) # 0.44655064
    # grad update step + loss computation for second network
    loss2, _ = sess.run([custom_loss2, train_op2], feed_dict={x:x_train, y:y_train})
    print(loss2) # 0.3163877
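
In this single-graph version, custom_loss1 depends on the variables of both networks, but train_op1 is restricted by var_list to the variables under network_1. As a rough illustration of what that restriction means, here is a plain-Python sketch with one scalar weight per network. The helper names (combined_loss, numeric_grad, sgd_step) are hypothetical, and a central-difference gradient stands in for TensorFlow's automatic differentiation:

```python
# Scalar stand-in for the single-graph setup. All names are hypothetical;
# the central-difference gradient replaces TensorFlow's autodiff.

X, Y = 3.0, 1.0  # one training example


def combined_loss(params):
    """Own loss of network_1 plus the loss computed on network_2's output."""
    logits1 = params["w1"] * X
    logits2 = params["w2"] * logits1
    return (logits1 - Y) ** 2 + (logits2 - Y) ** 2


def numeric_grad(loss_fn, params, name, eps=1e-6):
    """Central-difference estimate of d(loss)/d(params[name])."""
    up = dict(params, **{name: params[name] + eps})
    down = dict(params, **{name: params[name] - eps})
    return (loss_fn(up) - loss_fn(down)) / (2 * eps)


def sgd_step(loss_fn, params, var_list, lr):
    """Gradient step that touches only the parameters named in var_list."""
    updated = dict(params)
    for name in var_list:
        updated[name] = params[name] - lr * numeric_grad(loss_fn, params, name)
    return updated


params = {"w1": 0.5, "w2": 2.0}
after = sgd_step(combined_loss, params, var_list=["w1"], lr=0.01)

print(after)  # w1 moves, w2 is untouched
print(combined_loss(params), combined_loss(after))
```

The gradient for w1 includes the term that passes through w2, so the second network's loss shapes the first network's update even though w2 itself stays fixed during that step.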

Upvotes: 2
