Standard

Reputation: 1522

OpenCV capturing very slowly when using higher resolutions

I read images from a webcam and detect movement in them. I wrote a small class that times each step (reading the image, detecting movement, etc.) and prints the results.

When I initialise the webcam without setting a resolution, the captured frame is 640x480. I wanted a higher resolution, so I set it to 1920x1080:

WEBCAM_RAW_RES = (1920, 1080)
FRAMERATE = 20
vid = cv2.VideoCapture(0)
vid.set(cv2.CAP_PROP_FPS, FRAMERATE)
vid.set(cv2.CAP_PROP_FRAME_WIDTH, WEBCAM_RAW_RES[0])
vid.set(cv2.CAP_PROP_FRAME_HEIGHT, WEBCAM_RAW_RES[1])

It works, but the average time of vid.read() went from ~10 ms to over 200 ms. When I changed the resolution to 1280x800, vid.read() dropped to ~80 ms.

Here are, by the way, the resolutions the webcam supports:

pi@rpi:~/ $ v4l2-ctl -d /dev/video0 --list-formats-ext
ioctl: VIDIOC_ENUM_FMT
        Type: Video Capture

        [0]: 'MJPG' (Motion-JPEG, compressed)
                Size: Discrete 1920x1080
                        Interval: Discrete 0.033s (30.000 fps)
                        Interval: Discrete 0.040s (25.000 fps)
                        Interval: Discrete 0.050s (20.000 fps)
                        Interval: Discrete 0.067s (15.000 fps)
                        Interval: Discrete 0.100s (10.000 fps)
                        Interval: Discrete 0.200s (5.000 fps)
                Size: Discrete 1280x800
                        Interval: Discrete 0.033s (30.000 fps)
                        Interval: Discrete 0.040s (25.000 fps)
                        Interval: Discrete 0.050s (20.000 fps)
                        Interval: Discrete 0.067s (15.000 fps)
                        Interval: Discrete 0.100s (10.000 fps)
                        Interval: Discrete 0.200s (5.000 fps)
                Size: Discrete 1280x720
                        Interval: Discrete 0.033s (30.000 fps)
                        Interval: Discrete 0.040s (25.000 fps)
                        Interval: Discrete 0.050s (20.000 fps)
                        Interval: Discrete 0.067s (15.000 fps)
                        Interval: Discrete 0.100s (10.000 fps)
                        Interval: Discrete 0.200s (5.000 fps)
                Size: Discrete 640x400
                        Interval: Discrete 0.033s (30.000 fps)
                        Interval: Discrete 0.040s (25.000 fps)
                        Interval: Discrete 0.050s (20.000 fps)
                        Interval: Discrete 0.067s (15.000 fps)
                        Interval: Discrete 0.100s (10.000 fps)
                        Interval: Discrete 0.200s (5.000 fps)
                Size: Discrete 320x240
                        Interval: Discrete 0.033s (30.000 fps)
                        Interval: Discrete 0.040s (25.000 fps)
                        Interval: Discrete 0.050s (20.000 fps)
                        Interval: Discrete 0.067s (15.000 fps)
                        Interval: Discrete 0.100s (10.000 fps)
                        Interval: Discrete 0.200s (5.000 fps)
                Size: Discrete 640x480
                        Interval: Discrete 0.033s (30.000 fps)
                        Interval: Discrete 0.040s (25.000 fps)
                        Interval: Discrete 0.050s (20.000 fps)
                        Interval: Discrete 0.067s (15.000 fps)
                        Interval: Discrete 0.100s (10.000 fps)
                        Interval: Discrete 0.200s (5.000 fps)
        [1]: 'YUYV' (YUYV 4:2:2)
                Size: Discrete 1920x1080
                        Interval: Discrete 0.200s (5.000 fps)
                        Interval: Discrete 0.333s (3.000 fps)
                Size: Discrete 1280x800
                        Interval: Discrete 0.100s (10.000 fps)
                        Interval: Discrete 0.200s (5.000 fps)
                Size: Discrete 1280x720
                        Interval: Discrete 0.100s (10.000 fps)
                        Interval: Discrete 0.200s (5.000 fps)
                Size: Discrete 640x400
                        Interval: Discrete 0.033s (30.000 fps)
                        Interval: Discrete 0.040s (25.000 fps)
                        Interval: Discrete 0.050s (20.000 fps)
                        Interval: Discrete 0.067s (15.000 fps)
                Size: Discrete 320x240
                        Interval: Discrete 0.033s (30.000 fps)
                        Interval: Discrete 0.040s (25.000 fps)
                        Interval: Discrete 0.050s (20.000 fps)
                        Interval: Discrete 0.067s (15.000 fps)
                Size: Discrete 640x480
                        Interval: Discrete 0.033s (30.000 fps)
                        Interval: Discrete 0.040s (25.000 fps)
                        Interval: Discrete 0.050s (20.000 fps)
                        Interval: Discrete 0.067s (15.000 fps)

How do I get a higher resolution without a 200 ms delay? Do I have to switch to the MJPEG stream format? I couldn't find anything about it on the web.


Here is my whole code:

import os
import time  # needed for time.sleep() below
import cv2
from MotionDetectorBlob import MotionDetectorBlob
from utils import CalcTimer, resize_and_crop
from config import *

WEBCAM_RAW_RES = (640, 480)
FRAMERATE = 20

def start_webcam():
    print("Setting up webcam and finding the correct brightness.")
    vid = cv2.VideoCapture(0)
    vid.set(cv2.CAP_PROP_FPS, FRAMERATE)
    vid.set(cv2.CAP_PROP_FRAME_WIDTH, WEBCAM_RAW_RES[0])
    vid.set(cv2.CAP_PROP_FRAME_HEIGHT, WEBCAM_RAW_RES[1])
    vid.set(cv2.CAP_PROP_AUTO_EXPOSURE, 3)  # auto mode
    time.sleep(5)  # find brightness
    #vid.set(cv2.CAP_PROP_AUTO_EXPOSURE, 1)  # manual exposure mode
    vid.set(cv2.CAP_PROP_AUTO_WB, 0)  # manual awb
    vid.set(cv2.CAP_PROP_WB_TEMPERATURE, 4000)  # manual awb
    print("Webcam setup done.")

    detector = MotionDetectorBlob()

    firstrun = True

    timer = CalcTimer()
    while True:

        # Capture the video frame by frame
        try:
            timer.start()
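            ret, frame = vid.read()
            if not ret:
                # frame is None when the read fails, so stop instead of crashing
                print("Failed to read a frame from the webcam.")
                break
            print(f"original webcam res: {frame.shape}")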
            timer.measure("read")

            frame = cv2.rotate(frame, cv2.ROTATE_90_CLOCKWISE)
            timer.measure("rotate")

            # as we skip analyser, directly write the frames
            data = detector.detect(frame)

            print(timer.results())
        except KeyboardInterrupt:
            break

    # After the loop release the cap object
    vid.release()

start_webcam()

Upvotes: 1

Views: 2129

Answers (2)

foodybug

Reputation: 1

In my case, it worked well once I also set the fourcc every time I changed the resolution. Changing the resolution seems to reset the pixel format to the default unless the video format is specified explicitly.
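A minimal sketch of that idea, assuming a webcam opened with cv2.VideoCapture. The helper name set_capture_mode is hypothetical; it applies the size and frame rate and then re-applies the fourcc, so the format is requested again on every resolution change. It returns the size the capture reports afterwards, because a set() call can be ignored by the driver.

```python
def set_capture_mode(vid, width, height, fps, fourcc="MJPG"):
    """Apply size and frame rate, then re-apply the pixel format.

    `vid` is expected to be an opened cv2.VideoCapture. Returns the
    (width, height) the capture reports after the requests.
    """
    import cv2  # imported lazily so defining the helper does not need OpenCV

    vid.set(cv2.CAP_PROP_FRAME_WIDTH, width)
    vid.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
    vid.set(cv2.CAP_PROP_FPS, fps)
    # Assumption taken from this answer: the format may fall back to the
    # default after a resolution change, so request it again each time.
    vid.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*fourcc))

    # get() returns floats, so convert before comparing with the request.
    return (
        int(vid.get(cv2.CAP_PROP_FRAME_WIDTH)),
        int(vid.get(cv2.CAP_PROP_FRAME_HEIGHT)),
    )
```

For example, set_capture_mode(vid, 1920, 1080, 20) could replace the separate vid.set() calls in the question; if the returned size differs from the request, the camera did not accept that mode.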

Upvotes: 0

PaulvdBoor

Reputation: 442

The main reason vid.read() is slow is that you are capturing in the YUYV format, which is uncompressed and needs far more bandwidth than the MJPG format, which is compressed and can deliver higher resolutions at higher frame rates.

To switch to the MJPG format, you need to add one more line to your code after initializing the vid object:

vid = cv2.VideoCapture(0)
vid.set(cv2.CAP_PROP_FPS, FRAMERATE)
vid.set(cv2.CAP_PROP_FRAME_WIDTH, WEBCAM_RAW_RES[0])
vid.set(cv2.CAP_PROP_FRAME_HEIGHT, WEBCAM_RAW_RES[1])
vid.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*"MJPG")) # add this line

This tells OpenCV to request the MJPG format from the camera for the video stream. To check which format is actually in use, read the property back with vid.get(cv2.CAP_PROP_FOURCC). The formats the webcam supports are the ones listed by v4l2-ctl --list-formats-ext in your question.
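A rough sketch for reading a fourcc value back in readable form. vid.get() returns the code as a float holding a 32-bit number, so it has to be unpacked into its four characters. The helper names fourcc_to_str and str_to_fourcc are hypothetical.

```python
def fourcc_to_str(code):
    """Unpack a numeric fourcc, e.g. the value of vid.get(cv2.CAP_PROP_FOURCC)."""
    code = int(code)  # get() returns a float
    # Four ASCII characters, least significant byte first.
    return "".join(chr((code >> (8 * i)) & 0xFF) for i in range(4))


def str_to_fourcc(name):
    """Pack four ASCII characters the way cv2.VideoWriter_fourcc(*name) does."""
    if len(name) != 4:
        raise ValueError("a fourcc has exactly four characters")
    return sum(ord(ch) << (8 * i) for i, ch in enumerate(name))
```

With the capture from the snippet above, print(fourcc_to_str(vid.get(cv2.CAP_PROP_FOURCC))) should show MJPG if the request was accepted. Anything else suggests the camera is still delivering a different format.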

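Explanation: YUYV is a raw 4:2:2 format that stores two bytes per pixel on average: every pixel has its own luminance (Y) byte, and each pair of neighbouring pixels shares one U and one V chrominance byte. At 1920x1080, each frame therefore takes 1920x1080x2 = 4147200 bytes, or about 4 MB. Transferring that at 20 FPS would need about 4x20 = 80 MB/s, which exceeds what USB 2.0 can carry (480 Mbit/s, i.e. 60 MB/s at best). This is consistent with the v4l2-ctl output in the question, which lists YUYV at 1920x1080 only at 5 FPS or less, i.e. one frame every 200 ms, and at 1280x800 only at 10 FPS or less.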
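The arithmetic can be reproduced with a few lines of Python. This is only a sketch of the calculation; the function names are made up for illustration, and 1 MB is taken as 10**6 bytes. The 5 FPS figure is the fastest YUYV rate that the v4l2-ctl listing in the question shows for 1920x1080.

```python
def yuyv_frame_bytes(width, height):
    """Size of one raw YUYV 4:2:2 frame (2 bytes per pixel on average)."""
    return width * height * 2


def bandwidth_mb_per_s(frame_bytes, fps):
    """Data rate in MB/s (1 MB = 10**6 bytes) for a frame size and rate."""
    return frame_bytes * fps / 1e6


def frame_interval_ms(fps):
    """Time between two consecutive frames in milliseconds."""
    return 1000.0 / fps


frame = yuyv_frame_bytes(1920, 1080)
print(frame)                          # 4147200
print(bandwidth_mb_per_s(frame, 20))  # 82.944
print(frame_interval_ms(5))           # 200.0
```

The last value is the frame interval at 5 FPS, which is in line with the read time of over 200 ms reported in the question.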

The MJPG format is a compressed format that uses the JPEG algorithm to encode each frame as an image. This reduces the size of each frame significantly, depending on the quality and complexity of the image. For example, a typical JPEG image of 1920x1080 resolution can have a size of about 200 KB, or 0.2 MB. To transfer this amount of data at 20 FPS, you need a bandwidth of 0.2x20 = 4 MB/s, which is much lower than for the YUYV format. Therefore, by using the MJPG format, you can reduce the time spent waiting in vid.read() and reach the frame rates the camera lists for MJPG. Note that each JPEG frame then has to be decoded on the CPU.
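To confirm that the change helps on your setup, you can time vid.read() before and after switching formats. A minimal sketch using only the standard library; average_call_ms is a hypothetical helper, not part of OpenCV.

```python
import time


def average_call_ms(fn, repeats=50):
    """Call fn() `repeats` times and return the mean duration in milliseconds."""
    if repeats <= 0:
        raise ValueError("repeats must be positive")
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) * 1000.0 / repeats
```

Calling average_call_ms(vid.read) once with the default format and once after requesting MJPG gives two numbers you can compare directly.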

Upvotes: 3
