Viper

Reputation: 121

How to save the region of interest of a frame using openCV?

I am creating a deep learning program that is capable of identifying a hand digit gesture. I have finished training the model and now need to use it on live video. I am trying to write an OpenCV program where the user places a hand in a region of interest (a box) in the frame, and that ROI is fed to my CNN model. The model then responds based on the gesture.

I wrote the code below, which draws a 300x300 square (my ROI), but how do I feed that region of interest to my CNN model? I want only the square part to be the input to my model.

import traceback
import cv2

cam = cv2.VideoCapture(0)

while True:
    try:
        ret, frame = cam.read()
        frame = cv2.flip(frame, 1)
        cv2.rectangle(frame, (200, 100), (500, 400), (0, 255, 0), 2)
        cv2.imshow('curFrame', frame)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    except Exception:
        traceback.print_exc()

cam.release()
cv2.destroyAllWindows()

Additional:

ROI = frame[100:200 , 100:200]

What does that line mean?

Upvotes: 5

Views: 5396

Answers (1)

api55

Reputation: 11420

It is actually quite simple to create an ROI from a frame, and you basically already wrote it at the end of your question (ROI = frame[100:200 , 100:200]).
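As for what that line means: a frame is a numpy array indexed as [rows, columns], that is [y, x], while cv2.rectangle takes its corners as (x, y). The sketch below uses a synthetic black array as a stand-in for a webcam frame (the 480x640 size is an assumption) to show how the rectangle corners map onto a slice:

```python
import numpy as np

# Synthetic stand-in for a BGR webcam frame; 480x640 is an assumed size.
frame = np.zeros((480, 640, 3), dtype=np.uint8)

# Corners as passed to cv2.rectangle: (x, y) order.
x1, y1 = 200, 100   # top-left
x2, y2 = 500, 400   # bottom-right

# numpy slicing is [y1:y2, x1:x2]: rows first, then columns.
roi = frame[y1:y2, x1:x2]

# The line from the question: rows 100..199 and columns 100..199.
small = frame[100:200, 100:200]

print(roi.shape)    # (300, 300, 3)
print(small.shape)  # (100, 100, 3)
```

So frame[100:200, 100:200] is a 100x100 patch, not the 300x300 square drawn in the question.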

Let's assume this is your hand with your ROI after running the code from your question (image from the internet):

[Image: a hand inside the green ROI rectangle]

Now if you want what is inside the ROI as a separate image, you can use:

ROI = frame[100:400, 200:500] # according to the coordinates of your rectangle

However, because cv2.rectangle draws directly on frame, the rectangle will also be visible in the ROI (see image below), so you need to draw on a copy of the original image instead.

Here is what it looks like without copying from the original:

[Image: the ROI with the rectangle border visible]
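The difference between a slice and a copy can be checked without a camera. This is a minimal sketch on a synthetic array (the frame size is an assumption, and the rectangle edge is simulated with a plain numpy assignment rather than cv2.rectangle):

```python
import numpy as np

# Synthetic stand-in for a BGR webcam frame; 480x640 is an assumed size.
frame = np.zeros((480, 640, 3), dtype=np.uint8)

view = frame[100:400, 200:500]             # slice: shares memory with frame
snapshot = frame[100:400, 200:500].copy()  # copy: independent pixel data

# Simulate drawing the top edge of the green rectangle on the frame.
frame[100, 200:500] = (0, 255, 0)

print(np.shares_memory(view, frame))      # True
print(np.shares_memory(snapshot, frame))  # False
print(view[0, 0].tolist())                # [0, 255, 0] (edge shows up)
print(snapshot[0, 0].tolist())            # [0, 0, 0]   (copy is unaffected)

# The slice is also not stored contiguously in memory, unlike the copy.
print(view.flags['C_CONTIGUOUS'])      # False
print(snapshot.flags['C_CONTIGUOUS'])  # True
```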

Also, some algorithms behave unexpectedly when given a numpy slice, which is a view into the original array rather than an independent array, so it is better to make a copy. The code should look like this in the end:

import cv2

cam = cv2.VideoCapture(0)

if not cam.isOpened():
    print("Could not open cam")
    raise SystemExit(1)

while True:
    ret, frame = cam.read()
    if ret:
        frame = cv2.flip(frame, 1)  # mirror the image horizontally
        # Draw the rectangle on a copy so the original frame stays clean
        display = cv2.rectangle(frame.copy(), (200, 100), (500, 400), (0, 255, 0), 2)
        cv2.imshow('curFrame', display)
        # Rows 100..399 (y), columns 200..499 (x); copy to detach from frame
        ROI = frame[100:400, 200:500].copy()
        cv2.imshow('Current Roi', ROI)

    if cv2.waitKey(10) & 0xFF == ord('q'):
        break

cam.release()
cv2.destroyAllWindows()

Notice that I added checks for whether the camera is open and for ret. They will tell you if there was a problem opening the webcam or if a frame could not be read.

This will be the resulting image in ROI:

[Image: the cropped ROI without the rectangle border]

This can be saved with cv2.imwrite or passed to any other algorithm you have. If you have any questions feel free to ask.

Upvotes: 5
