Frederik Heber

Reputation: 85

tf.assign on a tf.concat tensor: does concat drop the Variable character of tensors?

I am trying to set specific values for the weights and biases of a TensorFlow neural network using the Python API. To this end, I placed all weights and biases in a common collection, reshaping them and joining the tensors from each layer with tf.concat.

At a certain stage in my code, I retrieve said collection. However, when I then try to tf.assign (from a tf.placeholder of the same shape) to this concatenated tensor, in order to set all weights/biases from a single vector of values in the feed_dict, I get the error

AttributeError: 'Tensor' object has no attribute 'assign'

I have boiled my problem down to a minimum working example (MWE) as follows:

import tensorflow as tf

a = tf.Variable(tf.random_uniform([2], dtype=tf.float32))
b = tf.Variable(tf.random_uniform([2], dtype=tf.float32))
c = tf.concat([a, b], axis=0)

d_all = tf.placeholder(shape=[4], dtype=tf.float32)
d_single = tf.placeholder(shape=[2], dtype=tf.float32)

#e_all = tf.assign(c, d_all)
e_single = tf.assign(a, d_single)

sess = tf.Session()
sess.run(tf.global_variables_initializer())

print(a)
print(d_single)

sess.run(e_single, feed_dict={
    d_single: [1, 2]
})

print(c)
print(d_all)

#sess.run(e_all, feed_dict={
#    d_all: [1, 2, 3, 4]
#})

The commented-out lines do not work: uncommenting them fails with the error above. It seems the tensor resulting from tf.concat is no longer a Variable and therefore has no assign method. I found a related issue here, but my problem is not solved by validate_shape as suggested there.

Any ideas? Is this desired behavior?

Upvotes: 2

Views: 461

Answers (1)

Maxim

Reputation: 53758

Yes, this is by design, because c is the output of an op, not a variable. Here's the simplest version of it:

c = a + b
tf.assign(c, a)  # Does not work!

Basically, this graph means that the node c depends on a and b through a certain operation (concat, addition, whatever). Assigning other values to c would conflict with the values coming from a and b; in other words, it would break the computational graph.

What you should do instead is split d_all into tensors of shape [2] and assign to the underlying variables a and b. That way is perfectly valid.

Upvotes: 2
