ashboy64

Reputation: 93

Deep copy of tensor in tensorflow python

In some of my code, I have created a neural network using TensorFlow and have access to a tensor representing that network's output. I want to make a copy of this tensor so that, even if I train the network further, I can still access its original value.

Following other answers and the TensorFlow documentation, I have tried the tf.identity() function, but it does not seem to do what I need. Some other links suggested tf.tile(), but this did not help either. I do not want to evaluate the tensor with sess.run() and store the resulting value elsewhere.

Here is a toy example that describes what I need to do:

import tensorflow as tf
import numpy as np

t1 = tf.placeholder(tf.float32, [None, 1])
t2 = tf.layers.dense(t1, 1, activation=tf.nn.relu)
expected_out = tf.placeholder(tf.float32, [None, 1])

loss = tf.reduce_mean(tf.square(expected_out - t2))
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)

sess = tf.Session()

sess.run(tf.global_variables_initializer())

print(sess.run(t2, feed_dict={t1: np.array([1]).reshape(-1,1)}))
t3 = tf.identity(t2) # Need to make copy here
print(sess.run(t3, feed_dict={t1: np.array([1]).reshape(-1,1)}))

print("\nTraining \n")

for i in range(1000):
    sess.run(train_op, feed_dict={t1: np.array([1]).reshape(-1, 1),
                                  expected_out: np.array([1]).reshape(-1, 1)})

print(sess.run(t2, feed_dict={t1: np.array([1]).reshape(-1,1)}))
print(sess.run(t3, feed_dict={t1: np.array([1]).reshape(-1,1)}))

The result of the above code is that t2 and t3 have the same value, even after training:

[[1.5078927]]
[[1.5078927]]

Training

[[1.3262703]]
[[1.3262703]]

What I want is for t3 to keep the value it had at the moment it was copied, even after training:

[[1.5078927]]
[[1.5078927]]

Training

[[1.3262703]]
[[1.5078927]]

Thanks in advance for your help.

Upvotes: 4

Views: 5124

Answers (2)

Jose Alberto Salazar

Reputation: 89

I think that maybe copy.deepcopy() could work... for example:

import copy 
tensor_2 = copy.deepcopy(tensor_1)

Python doc about deepcopy: https://docs.python.org/3/library/copy.html

Upvotes: 0

a_guest

Reputation: 36249

You can use a named tf.assign operation and then run only that operation via Graph.get_operation_by_name. This won't fetch the tensor's value; it just runs the assign operation on the graph. Consider the following example:

import tensorflow as tf

a = tf.placeholder(tf.int32, shape=(2,))
w = tf.Variable([1, 2])  # Updated in the training loop.
b = tf.Variable([0, 0])  # Backup; stores intermediate result.
t = tf.assign(w, tf.math.multiply(a, w))  # Update during training.
tf.assign(b, w, name='backup')

init_op = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)
    x = [2, 2]
    # Emulate training loop:
    for i in range(3):
        print('w = ', sess.run(t, feed_dict={a: x}))
    # Backup without retrieving the value (returns None).
    print('Backup now: ', end='')
    print(sess.run(tf.get_default_graph().get_operation_by_name('backup')))
    # Train a bit more:
    for i in range(3):
        print('w = ', sess.run(t, feed_dict={a: x}))
    # Check the backed-up value:
    print('Backup: ', sess.run(b))  # Is [8, 16].

So for your example you could do:

t3 = tf.Variable([], validate_shape=False)  # Shape isn't known until the assign runs.
tf.assign(t3, t2, validate_shape=False, name='backup')
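
Putting this together with the code from the question, a minimal sketch (untested here, assuming TF 1.x graph mode as in the example above) could look like this:

import tensorflow as tf
import numpy as np

t1 = tf.placeholder(tf.float32, [None, 1])
t2 = tf.layers.dense(t1, 1, activation=tf.nn.relu)
expected_out = tf.placeholder(tf.float32, [None, 1])

loss = tf.reduce_mean(tf.square(expected_out - t2))
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)

# Backup variable; its shape is only known once the assign runs.
t3 = tf.Variable([], dtype=tf.float32, trainable=False, validate_shape=False)
tf.assign(t3, t2, validate_shape=False, name='backup')

sess = tf.Session()
sess.run(tf.global_variables_initializer())

x = np.array([1]).reshape(-1, 1)

print(sess.run(t2, feed_dict={t1: x}))
# Snapshot the current output of t2 into t3 by running only the assign op.
sess.run(tf.get_default_graph().get_operation_by_name('backup'),
         feed_dict={t1: x})
print(sess.run(t3))

print("\nTraining\n")

for i in range(1000):
    sess.run(train_op, feed_dict={t1: x, expected_out: x})

print(sess.run(t2, feed_dict={t1: x}))  # Changed by training.
print(sess.run(t3))                     # Still the value saved before training.

The key point is that t3 is a variable, so it holds its value across sess.run() calls until the 'backup' op is run again.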

Upvotes: 1
