Fazzolini

Reputation: 5492

Having trouble parallelizing a Python process where Processes depend on each other

I have a deep learning application on Raspberry Pi 3B+. I have a loop that first grabs a frame from a camera and then passes it to a neural net for prediction and then displays the frame with the prediction on screen:

while True:
    frame = cam.get_frame()

    preds = model.predict(frame)

    label, score, c, pts = get_best_classes(preds, model)

    print("{} ({}): {:.2f}".format(label, c, score))

    screen.draw(frame, pts)

Grabbing and displaying a frame is very fast (real-time), but the prediction takes around 0.7 seconds. When I run it there is about 4 seconds of lag, meaning that when I move the camera, the output on the screen only moves after 4 seconds. I researched this and it's because frames pile up in the camera's buffer faster than the model can consume them. The suggested solution is to put frame grabbing and prediction on different threads, but I don't have experience with threading or multiprocessing.

I have Googled numerous tutorials, but they all use toy examples that just print things in parallel. I couldn't find an introductory tutorial where one process (prediction, in my case) depends on another process's output (a frame grabbed from the camera).

So my question has 3 parts:

  1. Could you please point me in the direction of a possible solution?
  2. Could you please provide some links to tutorials where processes share data, and one process only starts its work once another process has finished its part?
  3. How do I make sure the processes run in an infinite loop?

Upvotes: 1

Views: 155

Answers (2)

nathancy

Reputation: 46660

If you choose to go with the multiprocessing route, you can use either a multiprocessing Queue or a Pipe to share data between processes. Queues are both thread- and process-safe. You have to be more careful with a Pipe, because the data in a pipe may become corrupted if two processes (or threads) try to read from or write to the same end of the pipe at the same time. There is, of course, no risk of corruption when processes use different ends of the pipe simultaneously.
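As a minimal sketch of that Queue pattern, here is a producer/consumer pair where one process blocks on `q.get()` until the other supplies an item. The integer items and the `* 10` step are illustrative stand-ins for your frames and `model.predict()`, not your actual code:

```python
from multiprocessing import Process, Queue

def producer(q):
    # Stand-in for the frame-grabbing process: put a few items on the queue
    for i in range(3):
        q.put(i)
    q.put(None)  # sentinel telling the consumer to stop

def consumer(q, out):
    # Stand-in for the prediction process: q.get() blocks until an item arrives,
    # so this process only does work once the producer has finished its part
    while True:
        item = q.get()
        if item is None:
            break
        out.put(item * 10)  # stand-in for model.predict(frame)

if __name__ == '__main__':
    q, out = Queue(), Queue()
    p = Process(target=producer, args=(q,))
    c = Process(target=consumer, args=(q, out))
    p.start(); c.start()
    p.join(); c.join()
    print([out.get() for _ in range(3)])  # -> [0, 10, 20]
```

The sentinel (`None`) is a common way to signal shutdown through the same queue; wrapping the whole loop bodies in `while True` is also how you would keep both processes running indefinitely for a live camera.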

For your application, I would recommend going with the multithreading route. The idea is to have two threads to avoid sequentially getting and processing frames.

  • Thread #1 - Dedicated to only reading frames from the camera stream.
  • Thread #2 - Dedicated for processing frames (predicting).

We separate reading frames from processing because cv2.VideoCapture.read() is a blocking operation. By reading frames in a separate, dedicated thread we 'improve' FPS by reducing latency due to I/O. In addition, because frame capture is isolated in its own thread, there is always a frame ready to be processed instead of having to wait for the I/O operation to complete and return a new frame. The main thread, dedicated to processing, is then free to predict without waiting for the camera to grab the next frame.

Here is a widget that is dedicated to only reading frames from the camera stream. In the main program, you are able to freely process/predict using the latest frame.

from threading import Thread
import cv2

class VideoStreamWidget(object):
    def __init__(self, src=0):
        # Create a VideoCapture object
        self.capture = cv2.VideoCapture(src)

        # Initialize so the main thread never sees missing attributes
        self.status, self.frame = False, None

        # Start the thread to read frames from the video stream
        self.thread = Thread(target=self.update, args=())
        self.thread.daemon = True
        self.thread.start()

    def update(self):
        # Continuously read the next frame from the stream in its own thread
        while self.capture.isOpened():
            (self.status, self.frame) = self.capture.read()

    def show_frame(self):
        # Display the latest frame in the main program
        if self.status:
            cv2.imshow('frame', self.frame)

        # Press Q on the keyboard to stop the stream
        key = cv2.waitKey(1)
        if key == ord('q'):
            self.capture.release()
            cv2.destroyAllWindows()
            exit(0)

    def grab_latest_frame(self):
        return self.frame

if __name__ == '__main__':
    video_stream_widget = VideoStreamWidget(0)
    while True:
        video_stream_widget.show_frame()
        latest_frame = video_stream_widget.grab_latest_frame()
        if latest_frame is None:
            continue  # no frame has been read yet
        # Do processing (prediction) here with the latest frame
        # ...

Upvotes: 1

Fazzolini

Reputation: 5492

I found the reason for the delay. It turns out that cam.get_frame() (a thin wrapper I wrote around cv2.VideoCapture().read()) keeps about 5-7 prebuffered frames. So each time I called it, it returned not the current frame but the next frame from the buffer. I found this solution to be helpful.

After modifying the code the lag disappeared:

# buffered frames
n = 5
while True:
    # ignore buffered frames
    for _ in range(n):
        frame = cam.get_frame()

    preds = model.predict(frame)

    label, score, c, pts = get_best_classes(preds, model)

    print("{} ({}): {:.2f}".format(label, c, score))

    screen.draw(frame, pts)
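The draining step above can be factored into a small helper. Here is a minimal sketch using a stand-in camera object (FakeCam and its frame strings are illustrative, not the real wrapper) showing that only the newest buffered frame reaches the model:

```python
class FakeCam:
    """Stand-in for the camera wrapper: returns buffered frames oldest-first."""
    def __init__(self, frames):
        self.frames = list(frames)

    def get_frame(self):
        return self.frames.pop(0)

def latest_frame(cam, n_buffered=5):
    # Read n_buffered frames, discarding the stale ones and keeping the last
    for _ in range(n_buffered):
        frame = cam.get_frame()
    return frame

cam = FakeCam(['stale1', 'stale2', 'stale3', 'stale4', 'fresh'])
print(latest_frame(cam))  # -> fresh
```

The stale frames are still decoded and thrown away each iteration, which costs a little I/O; the threaded-reader approach in the other answer avoids that cost entirely, but for a 0.7 s prediction loop the draining version is usually simpler and fast enough.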

Upvotes: 0
