Reputation: 71
I ran into a problem of low frame capture efficiency in OpenCV.
Hardware & Software.
Task.
Get a video stream from an IP camera, recognize images, and display the resulting video (with marks and messages).
Important features: real-time processing, HD resolution (1280x720), high frame rate (>20 fps), continuous operation for several hours.
General algorithm: source video stream -> decoding and frame grabbing -> work with frames in OpenCV -> assembling the processed frames into a video stream -> display video using a Raspberry Pi GPU
The OpenCV output/display method, imshow, does not work well even with low-resolution video. The only library that allows using the Raspberry Pi GPU to decode and display video is Gstreamer.
I compiled Gstreamer modules (gstreamer1.0-plugins-bad, gstreamer1.0-omx) with OMX support and tested it:
gst-launch-1.0 rtspsrc location='rtsp://web_camera_ip' latency=400 ! queue ! rtph264depay ! h264parse ! omxh264dec ! glimagesink
It works great, CPU usage is about 9%.
Next I compiled OpenCV with Gstreamer, NEON, VFPV3 support.
I use the following code for testing:
import cv2
import numpy as np
src = 'rtsp://web_camera_ip'
# Decode the RTSP stream with OpenCV's default capture backend
stream_in = cv2.VideoCapture(src)
# Display frames through a Gstreamer pipeline fed by appsrc
pipeline_out = "appsrc ! videoconvert ! video/x-raw, framerate=20/1, format=RGBA ! glimagesink sync=false"
fourcc = cv2.VideoWriter_fourcc(*'H264')
stream_out = cv2.VideoWriter(pipeline_out, cv2.CAP_GSTREAMER, fourcc, 20.0, (1280,720))
while True:
    # Grab the next decoded frame from the input stream
    ret, frame = stream_in.read()
    if ret:
        stream_out.write(frame)
        cv2.waitKey(1)
It also worked, but not as well as Gstreamer itself. CPU usage is about 50%, or about 35% without stream_out.write(frame). At frame rates above 15 fps, there are lags and delays.
4.1. Use Gstreamer to decode video stream:
pipeline_in = 'rtspsrc location=rtsp://web_camera_ip latency=400 ! queue ! rtph264depay ! h264parse ! omxh264dec ! videoconvert ! appsink'
stream_in = cv2.VideoCapture(pipeline_in)
This made the situation even worse: the CPU load increased by several percent and the delay grew.
4.2. I also tried to optimize capture using the method from PyImageSearch.com: threading with WebcamVideoStream from the imutils library.
from threading import Thread
import cv2
import numpy as np
import imutils
from imutils.video import WebcamVideoStream
src = 'rtsp://web_camera_ip'
# Read frames in a background thread so the main loop does not block on capture
stream_in = WebcamVideoStream(src).start()
pipeline_out = "appsrc ! videoconvert ! video/x-raw, framerate=20/1, format=RGBA ! glimagesink sync=false"
fourcc = cv2.VideoWriter_fourcc(*'H264')
stream_out = cv2.VideoWriter(pipeline_out, cv2.CAP_GSTREAMER, fourcc, 20.0, (1280,720))
while True:
    # read() returns the most recent frame grabbed by the background thread
    frame = stream_in.read()
    stream_out.write(frame)
    cv2.waitKey(1)
CPU usage has increased to 70%, the quality of the output video stream has not changed.
4.3. Changing the following parameters does not help: waitKey(1-50), video stream bitrate (1000-5000 kB/s), video stream GOP (1-20).
As I understand it, the VideoCapture/VideoWriter methods are very inefficient. This may not be noticeable on a PC, but it is critical on the Raspberry Pi 3.
Thanks in advance for answers!
UPDATE 1
I think I know what the problem is, but I don't know how to solve it.
The main problem is that videoconvert does not support GPU - the main CPU load is due to the color format conversion!
I tested this assumption using the "pure" Gstreamer, adding the videoconvert:
gst-launch-1.0 rtspsrc location='web_camera_ip' latency=400 ! queue ! rtph264depay ! h264parse ! omxh264dec ! videoconvert ! video/x-raw, format=BGR ! glimagesink sync=false
Black display, CPU load is 25%.
Check this pipeline:
gst-launch-1.0 rtspsrc location='web_camera_ip' latency=400 ! queue ! rtph264depay ! h264parse ! omxh264dec ! videoconvert ! video/x-raw, format=RGBA ! glimagesink sync=false
Video is displayed, CPU load is 5%. I also assume that omxh264dec converts the color format from YUV to RGBA on the GPU (after omxh264dec, videoconvert does not load the CPU).
In this thread, 6by9, a Raspberry Pi engineer and graphics programming specialist, writes that "The IL video_encode component supports OMX_COLOR_Format24bitBGR888 which I seem to recall maps to OpenCV's RGB".
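To put the conversion cost in perspective, here is a rough estimate of how much uncompressed pixel data a CPU-side conversion has to touch every second. It is only a sketch: it assumes packed 3-byte BGR and 4-byte RGBA pixels and the 1280x720, 20 fps figures from above, and the function name is made up for illustration.

```python
def raw_bytes_per_second(width: int, height: int, bytes_per_pixel: int, fps: int) -> int:
    """Bytes of uncompressed pixel data produced per second at the given size and rate."""
    return width * height * bytes_per_pixel * fps


if __name__ == "__main__":
    # Assumed pixel sizes: BGR = 3 bytes, RGBA = 4 bytes per pixel
    for name, bytes_per_pixel in (("BGR", 3), ("RGBA", 4)):
        rate = raw_bytes_per_second(1280, 720, bytes_per_pixel, 20)
        print(f"{name}: {rate / 1e6:.1f} MB/s")
```

That is roughly 55 MB/s for BGR and 74 MB/s for RGBA, and every software conversion or copy has to read and write data on that scale.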
Are there any ideas?
Upvotes: 5
Views: 7901
Reputation: 322
Do you really need to recognize every image that you capture? You can use the first pipeline to display the video (with a video overlay for watermarks and other artifacts) and decode, for example, only every 6th image for recognition on the CPU. In that case the GPU handles capture and display without loading the CPU, and the CPU is used only for selective image recognition.
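A minimal sketch of the "every 6th image" selection, assuming the frames arrive as any Python iterable. The helper name frames_for_recognition is hypothetical, and range(20) only stands in for decoded frames; it is not tied to a particular capture API.

```python
from typing import Iterable, Iterator, Tuple, TypeVar

Frame = TypeVar("Frame")


def frames_for_recognition(frames: Iterable[Frame], every_n: int = 6) -> Iterator[Tuple[int, Frame]]:
    """Yield (index, frame) for every `every_n`-th frame, starting with frame 0."""
    if every_n < 1:
        raise ValueError("every_n must be >= 1")
    for index, frame in enumerate(frames):
        # Frames that are not selected are skipped without any processing
        if index % every_n == 0:
            yield index, frame


if __name__ == "__main__":
    # range(20) is a stand-in for 20 decoded frames
    picked = [index for index, _ in frames_for_recognition(range(20), every_n=6)]
    print(picked)
```

At 20 fps, every_n=6 means the recognizer sees about 3 frames per second, while the display pipeline keeps running at the full rate.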
Upvotes: 1