OpenCV movement detection without being triggered by random noise

I'm new to image processing and I'm struggling a bit. I'm making my own DIY security software, and I wrote a function that detects movement in order to start recording and notify me.

The idea of this function is to take two images and diff them in order to find movement. The problem I have is that either:

  1. The detection works really well, but at night there is noise in the image (and even during the day, in the shadows), which wrongly triggers a positive detection
  2. The function is not wrongly triggered, but it misses some detections

The way I tried option 2 is through the commented-out code; the main idea was to

Here is my code :

import cv2
import numpy as np
from skimage.metrics import structural_similarity as ssim

def count_diff_nb(img_1, img_2):

    # resize images
    img_1_height, img_1_width = img_1.shape[:2]
    new_height = int((600 / img_1_width) * img_1_height)

    img_1 = cv2.resize(img_1, (600,new_height))
    img_2 = cv2.resize(img_2, (600,new_height))

    # convert to gray scale
    gray_image1 = cv2.cvtColor(img_1, cv2.COLOR_BGR2GRAY)
    gray_image2 = cv2.cvtColor(img_2, cv2.COLOR_BGR2GRAY)

    # Gaussian blur in order to remove some noise
    blur1 = cv2.GaussianBlur(gray_image1, (5,5), 0)
    blur2 = cv2.GaussianBlur(gray_image2, (5,5), 0)

    # divide (bad idea)
    #divide1 = cv2.divide(gray_image1, blur1, scale=255)
    #divide2 = cv2.divide(gray_image2, blur2, scale=255)

    # Compute SSIM between two images
    #ssim_value, diff = ssim(gray_image1, gray_image2, full=True)
    ssim_value, diff = ssim(blur1, blur2, full=True)
    #ssim_value, diff = ssim(divide1, divide2, full=True)
    diff_percent = (1 - ssim_value) * 100

    # The diff image contains the actual image differences between the two images
    # and is represented as a floating point data type so we must convert the array 
    # to 8-bit unsigned integers in the range [0,255] before we can use it with OpenCV
    diff = (diff * 255).astype("uint8")

    # Adaptative threshold (bad idea too)
    #thresh = cv2.adaptiveThreshold(diff, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 11, 2)
    #thresh = cv2.adaptiveThreshold(diff, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, 3, 10)

    # Threshold the difference image
    thresh = cv2.threshold(diff, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]

    # followed by finding contours to
    # obtain the regions that differ between the two images
    contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contours = contours[0] if len(contours) == 2 else contours[1]
    # Highlight differences
    mask = np.zeros(img_1.shape, dtype='uint8')
    filled = img_2.copy()

    contours_nb = 0
    for c in contours:
        # contourArea returns square pixels, so both limits below are areas
        area = cv2.contourArea(c)
        # 72000 is 1/3 of the resized image area
        if 2000 < area < 72000:
            contours_nb = contours_nb + 1
            x,y,w,h = cv2.boundingRect(c)
            cv2.rectangle(img_1, (x, y), (x + w, y + h), (36,255,12), 2)
            cv2.rectangle(img_2, (x, y), (x + w, y + h), (36,255,12), 2)
            cv2.drawContours(mask, [c], 0, (0,255,0), -1)
            cv2.drawContours(filled, [c], 0, (0,255,0), -1)

    return contours_nb, diff_percent, img_2, filled

Do you have any ideas, or is there something I'm missing, to find the sweet spot between sensitivity (not missing detections) and ignoring random noise due to the darkness?

I thought about ignoring the dark colors before converting to grayscale, but if the moving thing is black then... I think it's a bad idea.

Thanks a lot !

Edit :

I changed the whole thing by implementing the solution suggested by @pippo1980. I now use BackgroundSubtractorMOG2, which works best in my case (I tested the different options).

So it works almost perfectly. The last pain point is now at sunrise and sunset, when my cheap webcam struggles with noise and the image is a little blurry / randomly noisy.

I'm looking into how to deal with this, but I'm not sure.

Here's when it's working fine; you can see that the mask is really sharp:

enter image description here

enter image description here

And at sunset, with the blur / noise in the image:

enter image description here

enter image description here

enter image description here

enter image description here

enter image description here

enter image description here

I don't know what you are doing wrong, but a bit of googling turns up a lot of approaches. For example, taken from Moving Object Detection with OpenCV using Contour Detection and Background Subtraction, here is a nice flowchart of an object detection pipeline using OpenCV:

enter image description here

That mentions background subtraction, which is not described in your algorithm, but I could be wrong: I can't read OpenCV by heart. The docs describe one of these methods as:

Every frame is used both for calculating the foreground mask and for updating the background. If you want to change the learning rate used for updating the background model, it is possible to set a specific learning rate by passing a parameter to the apply method.....

And you can actually find this method in the OpenCV docs:

enter image description here

Apparently there are two of them: BackgroundSubtractorMOG and BackgroundSubtractorMOG2.

There are actually three, also described here on SO:

Differences between MOG, MOG2, and GMG

I don't have your input, so I experimented using "inp_short_2.mp4" as input:

Here's an image preview:

enter image description here

With code:

import cv2
import numpy as np
import matplotlib.pyplot as plt

def ResizeWithAspectRatio(image, width=None, height=None, inter=cv2.INTER_AREA):
    dim = None
    (h, w) = image.shape[:2]

    if width is None and height is None:
        return image
    if width is None:
        # scale to the requested height
        r = height / float(h)
        dim = (int(w * r), height)
    else:
        # scale to the requested width
        r = width / float(w)
        dim = (width, int(h * r))

    return cv2.resize(image, dim, interpolation=inter)

draw_windows = True  ## change to False to skip the windows and only compute

def drawWindow(window_name, image):
    if draw_windows:
        resize = ResizeWithAspectRatio(image, width= 1000)
        cv2.imshow(window_name, resize)
        cv2.moveWindow(window_name, 600, 200)
# vid_path = ('inp.mp4')
# vid_path = ('inp_short.mp4')
vid_path = ('inp_short_2.mp4')

cap = cv2.VideoCapture(vid_path)

backSub = cv2.createBackgroundSubtractorMOG2()



# cv2.imshow('backSub_1st' , backSub)

# cv2.waitKey(0)

if not cap.isOpened():
    print("Error opening video file")
total_contours = 0

total_frames = 0

frames_out_list = []
while cap.isOpened():
    # print('cap.isOpened()' , cap.isOpened())
    # Capture frame-by-frame
    ret, frame = cap.read()
    total_frames += 1
    if ret:
        frame_copy = frame.copy()
        # print('ret : ' , ret)
        # Apply background subtraction (frames read by OpenCV are BGR)
        fg_mask = backSub.apply(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
        # print('fg_mask : ',  np.sum(fg_mask))
        # drawWindow('fg_mask', fg_mask)

        # apply global threshold to remove shadows
        retval, mask_thresh = cv2.threshold( fg_mask, 180, 255, cv2.THRESH_BINARY)

        # mask_thresh = fg_mask

        # set the kernel
        # kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3,3))
        # Apply erosion
        # mask_eroded = cv2.morphologyEx(mask_thresh, cv2.MORPH_OPEN, kernel)
        # Apply morphological operations to reduce noise and fill gaps
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
        # mask_eroded = mask_thresh
        # erode to drop small specks, then dilate the eroded mask to restore blob size
        mask_eroded  = cv2.erode(mask_thresh, kernel, iterations=1)
        mask_eroded  = cv2.dilate(mask_eroded, kernel, iterations=1)

        min_contour_area = 500  # Define your minimum area threshold
        # Find contours
        contours, hierarchy = cv2.findContours(mask_eroded, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        # print(contours)
        frame_ct = cv2.drawContours(frame, contours, -1, (0, 255, 0), 2)
        # Display the resulting frame
        # cv2.imshow('Frame_final', frame_ct)
        # cv2.waitKey(0)
        large_contours = [cnt for cnt in contours if cv2.contourArea(cnt) > min_contour_area ] #and cv2.contourArea(cnt) < 500000]

        #large_contours = [cnt for cnt in contours if cv2.contourArea(contours) > min_contour_area]

        frame_out = frame.copy()

        for cnt in large_contours:
            # drawContours expects a list of contours, so wrap the single contour
            frame_ct = cv2.drawContours(frame, [cnt], -1, (0, 255, 0), thickness = cv2.FILLED)
            total_contours += 1
            x, y, w, h = cv2.boundingRect(cnt)
            frame_out = cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 0, 200), 3)
            # Display the resulting frame
            # drawWindow('Frame_final', frame_out)
            print('fg_mask : ',  np.sum(fg_mask))
            # drawWindow('fg_mask', fg_mask)


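    else:
        # no more frames to read: leave the loop
        break

    if total_contours > 100:
        # assumption: stop early once enough contours have been counted
        break

print('total_contours :', total_contours, '/', total_frames)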

print('mask size : ', fg_mask.shape, fg_mask.size)

for contour in frames_out_list :
    x, y, w, h = cv2.boundingRect(contour)
    frame_out = cv2.rectangle(frame_copy, (x, y), (x+w, y+h), (0, 0, 200), 3)

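drawWindow('Frame_final', frame_copy)
if draw_windows:
    cv2.waitKey(0)  # keep the window open until a key is pressed
cap.release()
cv2.destroyAllWindows()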

I get these boxes as detected while running the video:

enter image description here

While using this other input:


Example pics:

enter image description here


# set the kernel
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3,3))
# Apply opening (erosion followed by dilation)
mask_eroded = cv2.morphologyEx(mask_thresh, cv2.MORPH_OPEN, kernel)

instead of the mask operations used in the code above, I get:

enter image description here

That is a kind of noise, but it doesn't seem too bad for an image full of lights and branches that move a lot.

This was my first try with these OpenCV capabilities. I am not an expert, but I would say that what you can get using a nice library and a couple of loops is not bad. Without your input and complete code, we cannot really comment on your results/question.

So here's my full solution at the moment. It works correctly except when the sun comes out quickly and there are bright surfaces.

image_pre_processing :

import cv2

def erode_and_contours(fg_mask, frame):
    retval, mask_thresh = cv2.threshold(fg_mask, 180, 255, cv2.THRESH_BINARY)

    # morphological opening (erosion followed by dilation) removes small specks
    # set the kernel
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3 ,3))
    # apply the opening
    mask_eroded = cv2.morphologyEx(mask_thresh, cv2.MORPH_OPEN, kernel)

    # Find contours
    contours, hierarchy = cv2.findContours(mask_eroded, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    # Filtering contours
    min_contour_area = 4000
    large_contours = [cnt for cnt in contours if cv2.contourArea(cnt) > min_contour_area]
    # Draw bounding boxes on a copy so the input frame is left untouched
    frame_out = frame.copy()
    for cnt in large_contours:
        x, y, w, h = cv2.boundingRect(cnt)
        cv2.rectangle(frame_out, (x, y), (x+w, y+h), (0, 200, 0), 3)

    return large_contours, mask_eroded, frame_out

def extract_contours(contours):
    # sort contours by the (x, y) of their bounding box
    sorted_contours = sorted(contours, key=lambda c: cv2.boundingRect(c)[:2])
    # extract rectangles in the form [x, y, width, height]
    rectangles = [list(cv2.boundingRect(c)) for c in sorted_contours]

    return rectangles

def check_contours_movement(stored_positions_list, movement_threshold_percent):
    # return (True, mean_diff) if a contour moved by more than movement_threshold_percent

    if len(stored_positions_list) < 2:
        return False, 0  # Not enough positions to compare

    # only compare as many rectangles as the shortest snapshot contains
    min_len = min(len(positions_list) for positions_list in stored_positions_list)

    first_positions = stored_positions_list[0]
    for positions_list in stored_positions_list[1:]:
        for i in range(min_len):
            x_diff = calc_diff_percent(first_positions[i][0], positions_list[i][0])
            y_diff = calc_diff_percent(first_positions[i][1], positions_list[i][1])
            w_diff = calc_diff_percent(first_positions[i][2], positions_list[i][2])
            h_diff = calc_diff_percent(first_positions[i][3], positions_list[i][3])

            mean_diff = (x_diff + y_diff + w_diff + h_diff) / 4
            if mean_diff > movement_threshold_percent:
                return True, mean_diff

    return False, 0

def calc_diff_percent(nb1, nb2):
    nb1 = max(nb1, 1)
    nb2 = max(nb2, 1)
    if nb2 < nb1:
        nb1, nb2 = nb2, nb1
    res = (nb2 - nb1) / nb1 * 100
    return res
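One thing worth noting about calc_diff_percent: because it divides by the smaller value, the same pixel shift gives very different percentages depending on where the box is in the frame. The sketch below shows this with made-up coordinates; `shift_percent_of_frame` is a hypothetical alternative (not part of my solution) that divides by the frame size instead.

```python
def calc_diff_percent(nb1, nb2):
    # same logic as the helper above, repeated so this snippet runs on its own
    nb1 = max(nb1, 1)
    nb2 = max(nb2, 1)
    if nb2 < nb1:
        nb1, nb2 = nb2, nb1
    return (nb2 - nb1) / nb1 * 100

def shift_percent_of_frame(pos1, pos2, frame_size):
    # hypothetical alternative: express the shift relative to the frame size
    return abs(pos2 - pos1) / frame_size * 100

# the same 50-pixel shift, near the left edge and far from it
near_edge = calc_diff_percent(10, 60)
far_from_edge = calc_diff_percent(500, 550)
print(near_edge, far_from_edge)

# relative to a 600-pixel-wide frame, both shifts score the same
print(shift_percent_of_frame(10, 60, 600), shift_percent_of_frame(500, 550, 600))
```

With the 30% threshold used in the main loop, the x term alone would be well above the threshold near the edge and well below it on the far side, so the sensitivity is not uniform across the image.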

main loop :

import cv2
import time
from image_pre_processing import erode_and_contours, extract_contours, check_contours_movement

def main(camera_index):
        camera = cv2.VideoCapture(camera_index)
        if not camera.isOpened():
            print("Error: unable to open the camera feed.")
            return

        init_phase = 0
        is_registering = False
        last_send_time = time.time()

        movement_start_time = None
        movement_duration_threshold = 2
        update_positions_interval = 0.5
        update_positions_time = time.time()

        backSub = cv2.createBackgroundSubtractorMOG2()

        stored_contours_positions = []

        while True:
            ret, frame = camera.read()
            if not ret:
                # no frame available: stop instead of passing None to OpenCV
                break

            denoised = cv2.fastNlMeansDenoisingColored(frame, None, 5, 5, 3, 9)
            gray = cv2.cvtColor(denoised, cv2.COLOR_BGR2GRAY)
            fg_mask = backSub.apply(gray)
            cv2.imshow('gray', gray)
            if init_phase > 10:
                large_contours, mask_eroded, frame_out = erode_and_contours(fg_mask, frame)

                # movement identified
                nb_diff = len(large_contours)
                if nb_diff > 0:
                    # If begin of a movement we start to count
                    if movement_start_time is None:
                        movement_start_time = time.time()

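                    # temporize and register contours position
                    if (time.time() - update_positions_time) >= update_positions_interval:
                        update_positions_time = time.time()
                        # assumption: the current boxes are stored here; otherwise the
                        # list stays empty and no movement is ever reported
                        stored_contours_positions.append(extract_contours(large_contours))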

                    contours_are_moving, mean_position_diff = check_contours_movement(stored_contours_positions, 30)

                    # if the movement has lasted longer than the threshold
                    if (time.time() - movement_start_time) >= movement_duration_threshold and contours_are_moving is True:
                        timing = time.time() - movement_start_time
                        # Send only 1 image per second
                        if  (time.time() - last_send_time) >= 1:
                            if is_registering is False:
                                is_registering = True
                                print(f"There's some activity on camera. nb_diff : {nb_diff}, for {timing:.2f}s and {mean_position_diff:.2f}% moving")
                            last_send_time = time.time()
                else:
                    # if no movement we re-init
                    is_registering = False
                    movement_start_time = None
                    stored_contours_positions = []
            # count frames outside the check above, otherwise init_phase never exceeds 10
            init_phase = init_phase + 1

            # temporize to not overload
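The timing part of the loop above (movement has to last at least movement_duration_threshold seconds before it counts) can be pulled out into a small helper, which makes it testable without a camera. This is only a sketch; the class name `MotionDebouncer` is made up, and timestamps are passed in explicitly instead of calling time.time().

```python
class MotionDebouncer:
    """Report motion only after it has persisted for `min_duration` seconds."""

    def __init__(self, min_duration=2.0):
        self.min_duration = min_duration
        self.started_at = None

    def update(self, motion_detected, now):
        if not motion_detected:
            # any frame without motion resets the timer
            self.started_at = None
            return False
        if self.started_at is None:
            self.started_at = now
        return (now - self.started_at) >= self.min_duration

debouncer = MotionDebouncer(min_duration=2.0)
# (timestamp in seconds, motion seen in that frame)
events = [(0.0, True), (1.0, True), (1.5, False), (2.0, True), (4.5, True)]
results = [debouncer.update(detected, t) for t, detected in events]
print(results)
```

In the real loop you would call `debouncer.update(nb_diff > 0, time.time())` once per frame.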


So, as I'm having issues with the algorithm detecting movement when the sun bursts out and there are shiny spots that are not moving, I'm thinking about two options:

  1. Look into histogram processing
  2. Use TensorFlow with YOLO to check whether there are humans or other objects in the detections (but this may be overkill and use too much CPU, as it runs on a Raspberry Pi)

