Receiving coordinates from inference in PyTorch

Question

I'm trying to get the coordinates of the pixels inside a mask generated by Detectron2's DefaultPredictor (built on PyTorch), so that I can later get the polygon corners and use them in my application.

However, DefaultPredictor produces a tensor of pred_masks in the following format: [False, False ... False], ... [False, False, .. False], where the length of each individual list is the width of the image and the total number of lists is the height of the image.

Since I need the pixel coordinates that are inside the mask, the simple solution seemed to be looping through pred_masks, checking each value, and appending an (x, y) tuple to a list whenever the value is True. However, the images are about 3200 x 1600 pixels, so this is relatively slow: about 4 seconds to loop through a single 3200x1600 mask. As there are quite a few objects for which I need inference in the end, this will end up being incredibly slow.

What would be a smarter way to get the coordinates (mask) of the detected object using the PyTorch (Detectron2) model?

Please find my code below for reference:

from __future__ import print_function

from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.data import MetadataCatalog
from detectron2.data.datasets import register_coco_instances


import cv2
import time


# get image
start = time.time()

im = cv2.imread("inputImage.jpg")

# Create config
cfg = get_cfg()
cfg.merge_from_file("detectron2_repo/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5 # Set threshold for this model
cfg.MODEL.WEIGHTS = "model_final.pth" # Set path model .pth
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1
cfg.MODEL.DEVICE='cpu'

register_coco_instances("dataset_test",{},"testval.json","Images_path")
test_metadata = MetadataCatalog.get("dataset_test")


# Create predictor
predictor = DefaultPredictor(cfg)

# Make prediction
outputs = predictor(im)

 
# Loop through pred_masks and add the (x, y) coordinates of every True pixel to true_cords_list
outputnump = outputs["instances"].pred_masks.numpy()

true_cords_list = []
x_length = range(len(outputnump[0][0]))
# the y coordinate is the range value (row index)
for y_cord in range(len(outputnump[0])):
    #x cord
    for x_cord in x_length:
        if str(outputnump[0][y_cord][x_cord]) == "True":
            inputcoords = (x_cord,y_cord)
            true_cords_list.append(inputcoords)

print(str(true_cords_list))

end = time.time()

print(f"Runtime of the program is {end - start}") # 14.29468035697937


EDIT: After partially replacing the for loop with itertools.compress, I've managed to reduce the runtime of the loop by ~3x. Ideally, however, I would like to receive this from the predictor itself if possible.

from itertools import compress

y_length = len(outputnump[0])
x_length = len(outputnump[0][0])
true_cords_list = []
for y_cord in range(y_length):
    x_cords = list(compress(range(x_length), outputnump[0][y_cord]))
    if x_cords:
        for x_cord in x_cords:
            inputcoords = (x_cord,y_cord)
            true_cords_list.append(inputcoords)

maxim velikanov · Accepted Answer

The problem is easily solved with NumPy's or PyTorch's native array handling, which allows speedups on the order of 100x compared to Python loops. It is worth studying the NumPy library; PyTorch tensors behave much like NumPy arrays.

How to get indices of values in NumPy:

import numpy as np
arr = np.random.rand(3,4) > 0.5
ind = np.argwhere(arr)[:, ::-1]
print(arr)
print(ind)

In your particular case this will be

ind = np.argwhere(outputnump[0])[:, ::-1]
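
As a quick sanity check, here is a minimal sketch that compares the np.argwhere result with the nested-loop approach from the question. The small `masks` array is a made-up stand-in for outputs["instances"].pred_masks.numpy(), assumed to have shape (num_instances, height, width), and `mask_to_xy` is a hypothetical helper name:

```python
import numpy as np

# Synthetic stand-in for outputs["instances"].pred_masks.numpy()
# (assumed shape: num_instances x height x width, dtype bool).
masks = np.zeros((2, 4, 6), dtype=bool)
masks[0, 1:3, 2:5] = True   # instance 0: rows 1-2, columns 2-4
masks[1, 0, 0] = True       # instance 1: a single pixel


def mask_to_xy(mask):
    """Return an (n_pixels, 2) array of (x, y) coordinates of the True pixels."""
    return np.argwhere(mask)[:, ::-1]


coords = mask_to_xy(masks[0])

# Reference result built the slow way, as in the question's nested loops.
loop_coords = [
    (x, y)
    for y in range(masks.shape[1])
    for x in range(masks.shape[2])
    if masks[0, y, x]
]

print(coords.tolist())
print(loop_coords)
```

Both approaches visit the pixels row by row, so the coordinates come out in the same order; only the container differs (a NumPy array instead of a list of tuples).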

How to get indices of values in PyTorch:

import torch
arr = torch.rand(3, 4) > 0.5
ind = arr.nonzero()
ind = torch.flip(ind, [1])
print(arr)
print(ind)

[:, ::-1] and torch.flip are used to reverse the order of the coordinates from (y, x) to (x, y).
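
The same idea can be applied to every detected instance at once without leaving PyTorch. The following is only a sketch: `pred_masks` here is a synthetic tensor standing in for outputs["instances"].pred_masks, assumed to be a boolean tensor of shape (num_instances, height, width), and `coords_per_instance` is an illustrative name.

```python
import torch

# Synthetic stand-in for outputs["instances"].pred_masks
# (assumed shape: num_instances x height x width, dtype bool).
pred_masks = torch.zeros((2, 4, 6), dtype=torch.bool)
pred_masks[0, 1:3, 2:5] = True   # instance 0: rows 1-2, columns 2-4
pred_masks[1, 0, 0] = True       # instance 1: a single pixel

# One call over the whole stack: each row of idx is (instance, y, x).
idx = pred_masks.nonzero()

# Select the rows of each instance, drop the instance column,
# and flip (y, x) -> (x, y).
coords_per_instance = [
    torch.flip(idx[idx[:, 0] == i, 1:], [1])
    for i in range(pred_masks.shape[0])
]

for i, coords in enumerate(coords_per_instance):
    print(i, coords.tolist())
```

If the model runs on a GPU, the resulting tensors stay on that device; calling .cpu() on them before converting to NumPy or Python lists would be needed in that case.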

NumPy and PyTorch also let you check simple conditions and get the indices of the values that meet them; for further understanding, see the corresponding NumPy documentation.
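
For illustration, a small sketch of condition-based indexing in NumPy. The `scores` array and the 0.5 threshold are invented for the example and are not part of the Detectron2 output:

```python
import numpy as np

# Made-up data, used only to demonstrate condition-based indexing.
scores = np.array([[0.1, 0.7, 0.4],
                   [0.9, 0.2, 0.6]])

# The comparison yields a boolean array; argwhere lists the
# (row, col) index of every True entry.
rc = np.argwhere(scores > 0.5)

# np.nonzero returns one index array per axis, which is convenient
# for indexing back into the original array.
rows, cols = np.nonzero(scores > 0.5)
values = scores[rows, cols]

print(rc.tolist())
print(values.tolist())
```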

When asking, you should provide links for the context of your problem. This question is actually about Facebook's object detector, Detectron2, for which they provide a nice demo Colab notebook.
