Reputation: 169
I have a binary mask as a tensor in TensorFlow.
How can I convert this binary mask into a bounding box using TensorFlow operations?
Upvotes: 2
Views: 1924
Reputation: 182
After a bit of work I managed to solve it. Note that this solution only works for a single object; with a little tweaking you could extend it to multiple objects.
Basically, you want to check whether there are any true pixels along an entire row or column. Start from an edge and move inwards until you hit a row or column with at least one true pixel. Do this from the left, right, top and bottom.
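To make the idea concrete, here is a minimal sketch on a toy mask. The 4x6 mask is made-up data, and `tf.reduce_any` is used here as one possible way to compute the per-row and per-column flags; it is not necessarily how the code further down does it.

```python
import tensorflow as tf

# Toy 4x6 mask (hypothetical data): the object covers rows 1-2 and columns 2-4.
mask = tf.constant([
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
], dtype=tf.uint8)

row_has_mask = tf.reduce_any(mask > 0, axis=1)  # one flag per row
col_has_mask = tf.reduce_any(mask > 0, axis=0)  # one flag per column

rows = tf.where(row_has_mask)[:, 0]  # indices of rows that contain the object
cols = tf.where(col_has_mask)[:, 0]  # indices of columns that contain the object

# First and last flagged index on each axis give the box (inclusive pixel indices).
y_min, y_max = int(rows[0]), int(rows[-1])
x_min, x_max = int(cols[0]), int(cols[-1])
print(y_min, x_min, y_max, x_max)  # 1 2 2 4
```

The first and last flagged rows give `y_min` and `y_max`, and the first and last flagged columns give `x_min` and `x_max`.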
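import numpy as np
import tensorflow as tf

# Load the mask as a single-channel image; the squeeze and reshape calls below rely on one channel.
image = tf.io.read_file('mask.png')
image = tf.io.decode_png(image, channels=1)
# Nearest-neighbour resizing keeps the mask values binary.
image = tf.image.resize(image, size=(300, 300), method='nearest')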
# axis=0 collapses the height, leaving one flag per column: True if that column contains any mask pixel
col_has_mask = tf.math.count_nonzero(image, axis=0, dtype=tf.bool)
col_has_mask = tf.squeeze(col_has_mask, axis=1)  # drop the channel axis -> 1-D vector of length 300
# axis=1 collapses the width, leaving one flag per row
row_has_mask = tf.math.count_nonzero(image, axis=1, dtype=tf.bool)
row_has_mask = tf.squeeze(row_has_mask, axis=1)

def indices_by_value(value):
    # return all the indices where the mask is present along the given axis
    return tf.where(value)[:, 0]

# coordinates (this assumes the mask contains at least one non-zero pixel)
y_min = indices_by_value(row_has_mask)[0]   # first row containing a true pixel
y_max = indices_by_value(row_has_mask)[-1]  # last row containing a true pixel
x_min = indices_by_value(col_has_mask)[0]   # first column containing a true pixel
x_max = indices_by_value(col_has_mask)[-1]  # last column containing a true pixel
# apply the bounding box
image = tf.image.convert_image_dtype(image, dtype=tf.float32)
img = tf.expand_dims(image, axis=0)  # add a batch axis -> shape [1, 300, 300, 1]
box = tf.stack([y_min, x_min, y_max, x_max], axis=0)
box = tf.cast(box, tf.float32) / 300.0  # draw_bounding_boxes expects coordinates normalised to [0, 1]
boxes = tf.reshape(box, [1, 1, 4])  # [batch, num_boxes, 4]
colors = np.array([[0.5, 0.9, 0.5], [0.5, 0.9, 0.5]], dtype=np.float32)
bounding_box = tf.image.draw_bounding_boxes(img, boxes, colors)
tf.keras.preprocessing.image.save_img('boxed.png', bounding_box.numpy()[0])
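As for the "tweaking" needed for multiple objects, one possible sketch follows. It assumes each object already has its own binary mask, stacked into a tensor of shape [N, H, W]; splitting a single mask into separate objects is a different problem and is not covered here. The function name `masks_to_boxes` and the test data are hypothetical.

```python
import tensorflow as tf

def masks_to_boxes(masks):
    """Sketch: masks is [N, H, W], one binary mask per object.

    Returns an [N, 4] int32 tensor of (y_min, x_min, y_max, x_max),
    using inclusive pixel indices.
    """
    masks = tf.cast(masks, tf.bool)
    height = tf.shape(masks)[1]
    width = tf.shape(masks)[2]

    row_has_mask = tf.reduce_any(masks, axis=2)  # [N, H]
    col_has_mask = tf.reduce_any(masks, axis=1)  # [N, W]

    ys = tf.range(height)  # [H], broadcast against [N, H] below
    xs = tf.range(width)   # [W]

    # Keep the index where the flag is set, otherwise a sentinel that cannot win
    # the min/max. An empty mask therefore yields y_min > y_max and x_min > x_max.
    y_min = tf.reduce_min(tf.where(row_has_mask, ys, height), axis=1)
    y_max = tf.reduce_max(tf.where(row_has_mask, ys, -1), axis=1)
    x_min = tf.reduce_min(tf.where(col_has_mask, xs, width), axis=1)
    x_max = tf.reduce_max(tf.where(col_has_mask, xs, -1), axis=1)

    return tf.stack([y_min, x_min, y_max, x_max], axis=1)

# Two toy 4x6 masks (made-up data): a small blob and a single pixel.
masks = tf.constant([
    [[0, 0, 0, 0, 0, 0],
     [0, 0, 1, 1, 0, 0],
     [0, 0, 1, 1, 1, 0],
     [0, 0, 0, 0, 0, 0]],
    [[0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0],
     [1, 0, 0, 0, 0, 0]],
], dtype=tf.uint8)

boxes = masks_to_boxes(masks)
print(boxes.numpy().tolist())  # [[1, 2, 2, 4], [3, 0, 3, 0]]
```

Because this version avoids Python-level indexing such as `[0]` and `[-1]` on a variable-length result, it should also work on an empty mask without raising, at the cost of returning an inverted box that the caller has to check for.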
Upvotes: 2