reese0106

Reputation: 2061

Tensorflow Extract Indices Not Equal to Zero

I want to return a dense tensor containing the column indices of the non-zero entries in each row. For example, given the tensor:

[0,1,1]
[1,0,0]
[0,0,1]
[0,1,0]

it should return:

[1,2]
[0]
[2]
[1]

I can get the indices using tf.where(), but I do not know how to combine the results based on the first index. For example:

import tensorflow as tf  # TensorFlow 1.x API

graph = tf.Graph()
with graph.as_default():
    data = tf.constant([[0,1,1],[1,0,0],[0,0,1],[0,1,0]])
    # one [row, column] pair per non-zero entry
    indices = tf.where(tf.not_equal(data,0))
sess = tf.InteractiveSession(graph=graph)
print(sess.run([indices]))

The above code returns:

[array([[0, 1],
       [0, 2],
       [1, 0],
       [2, 2],
       [3, 1]])]

However, I would like to group these results by the first column of the indices (the row number). Can anybody suggest a way to do this?

UPDATE

I am trying to get this to work on a wider matrix and am getting incorrect output. If I run the code below:

sess = tf.InteractiveSession()
a = tf.constant([[0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 1, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 1, 0, 1],
       [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
       [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]])
row_counts = tf.reduce_sum(a, axis=1)
max_padding = tf.reduce_max(row_counts)
extra_padding = max_padding - row_counts
extra_padding_col = tf.expand_dims(extra_padding, 1)
range_row = tf.expand_dims(tf.range(max_padding), 0)
padding_array = tf.cast(tf.tile(range_row, [9, 1])<extra_padding_col, tf.int32)
b = tf.concat([a, padding_array], axis=1)
result = tf.map_fn(lambda x: tf.cast(tf.where(tf.not_equal(x, 0)), tf.int32), b)
result = tf.where(result<=max_padding, result, -1*tf.ones_like(result)) # replace with -1's
result = tf.reshape(result, (int(result.get_shape()[0]), max_padding))
result.eval()

I get too many -1's, so the solution is not quite there:

[[ 1,  2],
       [ 2, -1],
       [-1, -1],
       [-1, -1],
       [-1, -1],
       [-1, -1],
       [-1, -1],
       [-1, -1],
       [ 0, -1]]

Upvotes: 4

Views: 3291

Answers (1)

Yaroslav Bulatov

Reputation: 57893

Notice that in your example, the output is not a matrix but a jagged array. Jagged arrays have limited support in TensorFlow (through TensorArray), so it's more convenient to deal with rectangular arrays. You could pad each row with -1's to make the output rectangular.

Suppose your output were already rectangular. Then no padding is needed, and you could use map_fn as follows:
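As a plain-Python reference for the target shape (this is not TensorFlow code, and the function name `nonzero_indices_padded` and its `pad_value` parameter are hypothetical), the padded rectangular output could be sketched like this:

```python
def nonzero_indices_padded(rows, pad_value=-1):
    """Return the non-zero column indices of each row, right-padded to equal length."""
    # Jagged result: one list of column indices per row.
    per_row = [[j for j, v in enumerate(row) if v != 0] for row in rows]
    # Width of the rectangular output is the longest per-row list.
    width = max((len(r) for r in per_row), default=0)
    return [r + [pad_value] * (width - len(r)) for r in per_row]


data = [[0, 1, 1], [1, 0, 0], [0, 0, 1], [0, 1, 0]]
padded = nonzero_indices_padded(data)
print(padded)  # [[1, 2], [0, -1], [2, -1], [1, -1]]
```

This is only meant as something to check a TensorFlow result against on small inputs.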

import tensorflow as tf  # TensorFlow 1.x API

tf.reset_default_graph()
sess = tf.InteractiveSession()
a = tf.constant([[0,1,1],[1,1,0],[1,0,1],[1,1,0]])
# cast needed because map_fn likes to keep same dtype, but tf.where returns int64
result = tf.map_fn(lambda x: tf.cast(tf.where(tf.not_equal(x, 0)), tf.int32), a)
# remove extra level of nesting
sess.run(tf.reshape(result, (4, 2)))

Output is

array([[1, 2],
       [0, 1],
       [0, 2],
       [0, 1]], dtype=int32)

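When padding is needed, you could do something like this. Note that the -1 replacement compares against the number of columns in a (max_index), because any index at or beyond it comes from a padding column: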

sess = tf.InteractiveSession()
a = tf.constant([[0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
   [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
   [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0, 1, 1, 0, 0, 0],
   [0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
   [0, 0, 0, 0, 0, 0, 0, 1, 0, 1],
   [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
   [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
   [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]])
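# number of non-zero entries in each row (a contains only 0s and 1s)
row_counts = tf.reduce_sum(a, axis=1)
# width of the output: the largest number of non-zero entries in any row
max_padding = tf.reduce_max(row_counts)
# number of columns in a; indices >= max_index come from padding columns
max_index = int(a.get_shape()[1])
# number of padding entries each row needs to reach max_padding
extra_padding = max_padding - row_counts
extra_padding_col = tf.expand_dims(extra_padding, 1)
range_row = tf.expand_dims(tf.range(max_padding), 0)
num_rows = tf.squeeze(tf.shape(a)[0])
# row i gets extra_padding[i] ones, so every row of b has exactly max_padding non-zeros
padding_array = tf.cast(tf.tile(range_row, [num_rows, 1])<extra_padding_col, tf.int32)
b = tf.concat([a, padding_array], axis=1)
result = tf.map_fn(lambda x: tf.cast(tf.where(tf.not_equal(x, 0)), tf.int32), b)
result = tf.where(result<max_index, result, -1*tf.ones_like(result)) # replace with -1's
result = tf.reshape(result, (int(result.get_shape()[0]), max_padding))
result.eval()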

This should produce

array([[ 1,  2],
       [ 2, -1],
       [ 4, -1],
       [ 5,  6],
       [ 6, -1],
       [ 7,  9],
       [ 8, -1],
       [ 9, -1],
       [ 0,  9]], dtype=int32)
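If a newer TensorFlow is available, ragged tensors may avoid the manual padding. The following is a minimal sketch that assumes TensorFlow 2.x with eager execution, so it uses a different API from the session-based code above. It groups the column indices returned by tf.where by their row index and then pads with -1:

```python
import tensorflow as tf  # assumption: TensorFlow 2.x, eager execution enabled

a = tf.constant([[0, 1, 1],
                 [1, 0, 0],
                 [0, 0, 1],
                 [0, 1, 0]])

# One [row, column] pair per non-zero entry, in row-major order (dtype int64).
idx = tf.where(tf.not_equal(a, 0))

# Group the column indices by row. nrows is passed explicitly so that
# trailing rows without any non-zero entry are kept as empty rows.
ragged = tf.RaggedTensor.from_value_rowids(
    values=idx[:, 1],
    value_rowids=idx[:, 0],
    nrows=int(a.shape[0]))

# Convert the jagged result to a rectangular tensor, filling gaps with -1.
padded = ragged.to_tensor(default_value=-1)

print(ragged.to_list())  # [[1, 2], [0], [2], [1]]
print(padded.numpy())
```

Here the -1 padding is applied only in the final to_tensor call, so the jagged form stays available if that is what you actually need.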

Upvotes: 1
