Reputation: 518
I'm running a Mask R-CNN model on an edge device (with an NVIDIA GTX 1080). I am currently using the Detectron2 Mask R-CNN implementation and I achieve an inference speed of around 5 FPS.
To speed this up I looked at other inference engines and model implementations, for example ONNX, but I was not able to gain a faster inference speed.
TensorRT looks very promising to me, but I did not find a ready "out-of-the-box" implementation for it.
Are there any other mature and fast inference engines or other techniques to speed up the inference?
Upvotes: 7
Views: 7742
Reputation: 518
As @kkHarshit already mentioned, it is very hard to speed up Mask R-CNN any further.
The fastest instance segmentation model that I found is YolactEdge: Real-time Instance Segmentation on the Edge (Jetson AGX Xavier: 30 FPS, RTX 2080 Ti: 170 FPS).
Its performance is worse than Mask R-CNN or even YOLACT, but it is still very good.
Upvotes: 0
Reputation: 1234
OpenCV 4.5.0 with `DNN_BACKEND_CUDA` and `DNN_TARGET_CUDA`/`DNN_TARGET_CUDA_FP16`.
Mask R-CNN with a 1024 × 1024 input image:

Device             | FPS
------------------ | ---
GTX 1080 Ti (FP32) | 29
RTX 2080 Ti (FP16) | 60
The measured FPS includes NMS but excludes other preprocessing and postprocessing. The network runs fully end-to-end on the GPU.
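The linked gist contains the full benchmark; the measurement idea can be sketched as a generic timing helper that excludes warm-up iterations (so one-time CUDA initialization and kernel compilation don't skew the average). The helper below is an illustration, not the gist's code:

```python
import time

def measure_fps(run_inference, n_warmup=10, n_iters=100):
    """Average FPS over n_iters calls, after n_warmup untimed calls.

    run_inference: a zero-argument callable that performs one
    forward pass (e.g. lambda: net.forward(out_names)).
    """
    for _ in range(n_warmup):
        run_inference()  # warm-up: excluded from timing
    start = time.perf_counter()
    for _ in range(n_iters):
        run_inference()
    elapsed = time.perf_counter() - start
    return n_iters / elapsed
```

Usage would be something like `measure_fps(lambda: net.forward(out_names))` with the blob already set as input, which matches the "NMS included, pre/postprocessing excluded" scope of the numbers above.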
Benchmark code: https://gist.github.com/YashasSamaga/48bdb167303e10f4d07b754888ddbdcf
Upvotes: 2
Reputation: 12837
It's almost impossible to get higher inference speed for Mask R-CNN on a GTX 1080. You may check Detectron2 by Facebook AI Research.
Otherwise, I'd suggest using YOLACT (You Only Look At CoefficienTs); it can achieve real-time instance segmentation.
On the other hand, if you don't need instance segmentation, you can use YOLO, SSD, etc. for object detection.
Upvotes: 3