I am trying to implement a kind of nonlinear filter in TensorFlow, but I am having trouble with the implementation for one step. The step is basically something like:
x_update = x.assign(tf.matmul(A, x))
The problem is that the matrix A is structured something like:
A = [[1, 0.1, 0, 0, 0],
     [0, 1, 0, 0, 0],
     [0, 0, f1(x), f2(x), f3(x)],
     [0, 0, f4(x), f5(x), f6(x)],
     [0, 0, 0, 0, 1]]
where each fn(x) is a nonlinear function of my state, something like tf.sin(x[4]) or even x[2]**2 * tf.sin(x[4]) + x[3]**2 * tf.cos(x[4]).
I do not know how to create my A matrix such that it embeds these operations. I start by initializing it with some values:
A_mat = np.eye(5)
A_mat[0, 1] = 0.1
A = tf.Variable(A_mat, dtype=tf.float32, trainable=False, name='A')
Then I tried some slice updating with tf.scatter_update, something like:
# Define my nonlinear operations.
f1 = tf.cos(...)
f2 = tf.sin(...)
# ...
# Define the part that I want to substitute.
new_part = tf.constant(tf.convert_to_tensor([[f1, f2, f3],
[f4, f5, f6]]))
# Define slice indices and update the matrix.
inds = [vals for vals in zip(np.arange(1, 3), np.arange(2, 5))]
A_update = tf.scatter_update(A, tf.constant(inds), new_part, name='A_update')
This gives me an error stating:
ValueError: Shapes must be equal rank, but are 1 and 0
From merging shape 1 with other shapes. for 'packed/0' (op: 'Pack') with input shapes: [1], [1], [], [], [], [].
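The shapes in that message ([1] next to []) suggest that rank-1 elements are being packed together with scalars. Below is a minimal sketch of how that mismatch can arise and how to check for it. It assumes that the state x is a 5x1 column (needed for tf.matmul(A, x), but not stated above) and that TensorFlow 2.x eager execution is used; the values are made up.

```python
import tensorflow as tf

# Hypothetical 5x1 state vector; the values are only for illustration.
x = tf.constant([[0.0], [0.0], [1.0], [2.0], [0.5]])

f_vec = tf.sin(x[4])              # one row of a 5x1 tensor still has shape (1,)
f_scalar = tf.reshape(f_vec, [])  # shape (): a true scalar
zero = tf.constant(0.0)           # shape ()

# Packing works once every element has the same (scalar) shape.
row = tf.stack([zero, zero, f_scalar])
print(f_vec.shape, f_scalar.shape, row.shape)
```

Indexing with x[4, 0] instead of x[4] would give a scalar directly and make the reshape unnecessary.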
I have also tried just assigning my matrix new_part back into the numpy-defined A_mat, but I get a different error, which I think is due to the unexpected datatype when a numeric array suddenly gets assigned Tensor elements.
So does anybody know how to define a matrix of operations that update when the matrix is used like this?
Ideally I would like to define the matrix A so that all the operations that update within A are a part of the call to A and happen automatically. That way I can avoid slice assignment altogether, and it would just feel more TensorFlow-y.
Thank you!
I got it past the errors with a combination of wrapping my ops in tf.reshape(op_name, []) and changing my update to:
new_part = tf.convert_to_tensor([[0, 0, f1, f2, f3],
                                 [0, 0, f4, f5, f6]])
rows = np.arange(start_row, end_row)
A_update = tf.scatter_update(A, rows, new_part, name='A_update')
It turns out that tf.scatter_update can only operate on the first dimension of a Variable, so I have to feed it full rows along with the row indices where I want to put them. This helps, but still leaves my question:
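For reference, here is a minimal sketch of that row-wise update. It assumes TensorFlow 2.x eager execution, where the equivalent call is the Variable.scatter_update method taking a tf.IndexedSlices; the numeric entries are placeholders standing in for f1..f6, and rows 2 and 3 are chosen to match the layout of A above.

```python
import numpy as np
import tensorflow as tf

A_mat = np.eye(5, dtype=np.float32)
A_mat[0, 1] = 0.1
A = tf.Variable(A_mat, trainable=False, name='A')

# Full rows to write; the nonzero entries are hypothetical stand-ins for f1..f6.
new_rows = tf.constant([[0.0, 0.0, 1.0, 2.0, 3.0],
                        [0.0, 0.0, 4.0, 5.0, 6.0]])
rows = tf.constant([2, 3])  # indices along the first dimension of A

# Whole rows are replaced; individual columns cannot be selected here.
A.scatter_update(tf.IndexedSlices(new_rows, rows))
```

Because whole rows are overwritten, the leading zeros in new_rows have to be supplied explicitly.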
What is the best, most TensorFlow-y way of defining this A matrix so that those elements that are constant remain constant, and those elements that are operations of other tensors on my graph are embedded in A as such? I want a call to A on my graph to go through and run those updates without needing to manually do this tf.scatter_update. Or is that the correct approach for this?
The easiest way to update a submatrix is to use TensorFlow's Python slicing ops on the Variable and call assign on the slice:
import numpy as np
import tensorflow as tf
A = tf.Variable(np.zeros((5, 5), dtype=np.float32), trainable=False)
new_part = tf.ones((2,3))
update_A = A[2:4,2:5].assign(new_part)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
print(update_A.eval())
# array([[ 0., 0., 0., 0., 0.],
# [ 0., 0., 0., 0., 0.],
# [ 0., 0., 1., 1., 1.],
# [ 0., 0., 1., 1., 1.],
# [ 0., 0., 0., 0., 0.]], dtype=float32)
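If the goal is to avoid assignment altogether, one possible alternative is to build A as an ordinary tensor that depends on x, so it is recomputed whenever it is used. The sketch below assumes TensorFlow 2.x eager execution and a 5x1 state; build_A is a hypothetical helper, and the sin/cos entries are placeholders for the real f1..f6.

```python
import numpy as np
import tensorflow as tf

# Constant part of A; the 2x3 block that depends on x is left at zero.
A_const_np = np.eye(5, dtype=np.float32)
A_const_np[0, 1] = 0.1
A_const_np[2:4, 2:5] = 0.0
A_const = tf.constant(A_const_np)


def build_A(x):
    """Hypothetical helper: return A(x) for a state x of shape (5, 1)."""
    x4 = x[4, 0]  # scalar tensor, so no reshape is needed
    # Placeholder nonlinear entries standing in for f1..f6.
    block = tf.stack([
        tf.stack([tf.cos(x4), -tf.sin(x4), tf.sin(x4)]),
        tf.stack([tf.sin(x4), tf.cos(x4), tf.cos(x4)]),
    ])
    # Zero-pad the 2x3 block to 5x5 so it lands on rows 2-3, columns 2-4.
    return A_const + tf.pad(block, [[2, 1], [2, 0]])


x = tf.Variable([[1.0], [2.0], [3.0], [4.0], [0.0]], trainable=False)
x.assign(tf.matmul(build_A(x), x))  # A is rebuilt from the current x
```

The constant entries live in A_const and are never touched, while the padded block carries the dependence on x, so no slice or scatter assignment is needed.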