Engineero

Reputation: 12908

Creating a matrix-Tensor of operations

I am trying to implement a kind of nonlinear filter in TensorFlow, but I am having trouble with one step. The step is basically something like:

x_update = x.assign(tf.matmul(A, x))

The problem is that the matrix A is structured something like:

A = [[1, 0.1, 0, 0, 0],
     [0, 1, 0, 0, 0],
     [0, 0, f1(x), f2(x), f3(x)],
     [0, 0, f4(x), f5(x), f6(x)],
     [0, 0, 0, 0, 1]]

Where each fn(x) is a nonlinear function of my state; something like tf.sin(x[4]) or even x[2]**2 * tf.sin(x[4]) + x[3]**2 * tf.cos(x[4]).
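For concreteness, each fn is just an ordinary scalar op built from my state tensor. Roughly like this (a sketch; I am assuming x is a 5x1 column vector so that tf.matmul(A, x) makes sense, and the particular functions are only placeholders):

import numpy as np
import tensorflow as tf

# State as a 5x1 column vector so that tf.matmul(A, x) is well defined.
x = tf.Variable(np.zeros((5, 1)), dtype=tf.float32, trainable=False, name='x')

# Each entry of the middle block is a scalar (rank-0) tensor built from x.
f1 = tf.cos(x[4, 0])
f2 = tf.sin(x[4, 0])
f3 = x[2, 0]**2 * tf.sin(x[4, 0]) + x[3, 0]**2 * tf.cos(x[4, 0])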

I do not know how to create my A matrix such that it embeds these operations. I start by initializing it with some values:

A_mat = np.eye(5)
A_mat[0, 1] = 0.1
A = tf.Variable(A_mat, dtype=tf.float32, trainable=False, name='A')

Then I was trying to do some slice updating with tf.scatter_update, something like:

# Define my nonlinear operations.
f1 = tf.cos(...)
f2 = tf.sin(...)
# ...

# Define the part that I want to substitute.
new_part = tf.constant(tf.convert_to_tensor([[f1, f2, f3],
                                             [f4, f5, f6]]))

# Define slice indices and update the matrix.
inds = [vals for vals in zip(np.arange(1, 3), np.arange(2, 5))]
A_update = tf.scatter_update(A, tf.constant(inds), new_part, name='A_update')

This gives me an error stating:

ValueError: Shapes must be equal rank, but are 1 and 0

From merging shape 1 with other shapes. for 'packed/0' (op: 'Pack') with input shapes: [1], [1], [], [], [], [].

I have also tried assigning my new_part matrix back into the NumPy-defined A_mat, but I get a different error, which I think is due to the unexpected datatype when a numeric array suddenly has Tensor elements assigned into it.

So does anybody know how to define a matrix of operations that updates whenever the matrix is used like this?

Ideally I would like to define the matrix A so that all the operations embedded within A are part of any call to A and are evaluated automatically. That way I can avoid slice assignment altogether, and it would just feel more TensorFlow-y.
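In other words, I am imagining something roughly like the following, where A is rebuilt as a plain tensor from its pieces every time it is needed, so the nonlinear entries are re-evaluated automatically (just a sketch of the idea, reusing the x and placeholder f's from above; I do not know whether this is the idiomatic way):

def build_A(x):
    # Constant rows stay constant; the middle rows are rebuilt from the
    # current state every time build_A is called.
    f1, f2, f3 = tf.cos(x[4, 0]), tf.sin(x[4, 0]), x[2, 0]**2
    f4, f5, f6 = tf.sin(x[4, 0]), tf.cos(x[4, 0]), x[3, 0]**2
    zero = tf.zeros([], dtype=tf.float32)
    row0 = tf.constant([1.0, 0.1, 0.0, 0.0, 0.0])
    row1 = tf.constant([0.0, 1.0, 0.0, 0.0, 0.0])
    row2 = tf.stack([zero, zero, f1, f2, f3])
    row3 = tf.stack([zero, zero, f4, f5, f6])
    row4 = tf.constant([0.0, 0.0, 0.0, 0.0, 1.0])
    return tf.stack([row0, row1, row2, row3, row4])

A = build_A(x)                          # A is now a 5x5 tensor of ops
x_update = x.assign(tf.matmul(A, x))    # evaluating this uses the current f's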

Thank you!


Update:

I got it past the errors with a combination of wrapping my ops in tf.reshape(op_name, []) (some of my ops apparently had shape [1] rather than being scalars, which is what the rank error was complaining about) and changing my update to:

new_part = tf.convert_to_tensor([[0, 0, f1, f2, f3],
                                 [0, 0, f4, f5, f6]])
rows = np.arange(start_row, end_row)
A_update = tf.scatter_update(A, rows, new_part, name='A_update')

It turns out that tf.scatter_update can only operate along the first dimension of a Variable, so I have to feed it full rows and the row indices where I want to put them. This helps, but still leaves my question:
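For reference, here is a small self-contained version of what runs for me now (dummy scalar ops stand in for my real f's, and rows 2 and 3 are the rows of A that hold them):

import numpy as np
import tensorflow as tf

A = tf.Variable(np.eye(5), dtype=tf.float32, trainable=False, name='A')
x = tf.Variable(np.ones((5, 1)), dtype=tf.float32, trainable=False, name='x')

# Dummy scalar ops standing in for f1...f6.
f1, f2, f3 = tf.cos(x[4, 0]), tf.sin(x[4, 0]), x[2, 0]**2
f4, f5, f6 = tf.sin(x[4, 0]), tf.cos(x[4, 0]), x[3, 0]**2

new_part = tf.convert_to_tensor([[0., 0., f1, f2, f3],
                                 [0., 0., f4, f5, f6]])
rows = np.arange(2, 4)  # full-row indices: rows 2 and 3 of A
A_update = tf.scatter_update(A, rows, new_part, name='A_update')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(A_update))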


My question:

What is the best, most TensorFlow-y way of defining this A matrix so that the elements that are constant remain constant, and the elements that are operations on other tensors in my graph are embedded in A as such? I want a call to A on my graph to run those updates automatically, without my having to do this tf.scatter_update manually. Or is that the correct approach here?

Upvotes: 1

Views: 757

Answers (1)

P-Gn

Reputation: 24581

The easiest way to update a submatrix is to use TensorFlow's Python slicing ops.

import numpy as np
import tensorflow as tf
A = tf.Variable(np.zeros((5, 5), dtype=np.float32), trainable=False)
new_part = tf.ones((2, 3))

# Sliced assignment writes new_part into rows 2:4, columns 2:5 of A.
update_A = A[2:4, 2:5].assign(new_part)

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
print(update_A.eval())
# array([[ 0.,  0.,  0.,  0.,  0.],
#        [ 0.,  0.,  0.,  0.,  0.],
#        [ 0.,  0.,  1.,  1.,  1.],
#        [ 0.,  0.,  1.,  1.,  1.],
#        [ 0.,  0.,  0.,  0.,  0.]], dtype=float32)
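For the question's use case, new_part can itself be assembled from ops on other tensors, so each run of the sliced assignment writes whatever those ops currently evaluate to. A sketch along those lines, reusing A and the InteractiveSession from above (the f's are placeholders for the real dynamics, and x is assumed to be a 5x1 state variable):

x = tf.Variable(np.ones((5, 1), dtype=np.float32), trainable=False)

# Placeholder scalar ops standing in for the question's f1...f6.
f1, f2, f3 = tf.cos(x[4, 0]), tf.sin(x[4, 0]), x[2, 0]**2
f4, f5, f6 = tf.sin(x[4, 0]), tf.cos(x[4, 0]), x[3, 0]**2

new_part = tf.stack([tf.stack([f1, f2, f3]),
                     tf.stack([f4, f5, f6])])

# Each run of update_A re-evaluates the f's and writes them into A[2:4, 2:5].
update_A = A[2:4, 2:5].assign(new_part)

tf.global_variables_initializer().run()
print(update_A.eval())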

Upvotes: 2
