How to get the background from multiple images by removing moving objects?

I have taken multiple images of the same scene with a fixed camera, and the scene has moving objects in it. I don't understand how I can use these images in Python to retrieve the background image by removing all the moving objects.

Any help would be appreciated. Thanks!

Images have been attached below:

In this case, I would expect the final image to be without any hands in it.

image1:

image2:

image3:

Upvotes: 3

Answers (3)

Mark Setchell

Reputation: 207678

Updated Answer

I worked out how to do what I suggested below in Python - but there may be better ways - I am still a Python beginner!

#!/usr/bin/env python3

import numpy as np
from PIL import Image

# Load images
im0 = np.array(Image.open('1.jpg'))
im1 = np.array(Image.open('2.jpg'))
im2 = np.array(Image.open('3.jpg'))

# Stack the 3 images into a 4d sequence
sequence = np.stack((im0, im1, im2), axis=3)

# Replace each pixel by the median of the sequence
result = np.median(sequence, axis=3).astype(np.uint8)

# Save to disk
Image.fromarray(result).save('result.png')

Original Answer

The easiest way is to take the median of each pixel across the 3 images. At any given pixel, the 2 images without the hand will have values near each other and the one with the hand will be the outlier, and taking the median discards outliers.

So, I don't mean you look at a 3x3 or 5x5 area of each image and calculate the median. Rather, I mean look at pixel[0,0] in image 1, 2 and 3 and take the median of those three values - you do that by sorting the 3 values into order and picking the middle value of the three sorted pixels as your output value. Then look at pixel[0,1] in all 3 images and repeat the process.
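To make the per-pixel idea concrete, here is a minimal sketch on three tiny 2x2 grayscale frames. The frame names and pixel values are made up for illustration. It does the sort-and-pick-the-middle step by hand for pixel[0,0], then the same thing for every pixel at once with np.median:

```python
import numpy as np

# Three tiny 2x2 grayscale "frames" (made-up values for illustration).
# The background is 10 everywhere; a bright "moving object" (200)
# covers a different pixel in each frame.
frame_a = np.array([[200, 10], [10, 10]], dtype=np.uint8)
frame_b = np.array([[10, 200], [10, 10]], dtype=np.uint8)
frame_c = np.array([[10, 10], [200, 10]], dtype=np.uint8)

# Stack along a new last axis, giving shape (2, 2, 3)
stack = np.stack((frame_a, frame_b, frame_c), axis=-1)

# Manual version for pixel[0,0]: sort the three values, take the middle one
values = sorted(int(v) for v in stack[0, 0])
middle = values[1]

# Vectorised version: median across the frame axis for every pixel
background = np.median(stack, axis=-1).astype(np.uint8)

print(middle)
print(background)
```

Because the bright value appears in only one of the three frames at any location, it is never the middle value, so it drops out of the result.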

This is the result:

I didn't write the Python, I just did exactly the same thing with ImageMagick in Terminal like this:

convert 1.jpg 2.jpg 3.jpg -evaluate-sequence median result.jpg

Just so you can see what I am doing, if I calculate the mean/average of the 3 pixels at each location, rather than the median, the hands will show up at just 1/3 of their original density:
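To put numbers on that, here is a small sketch for a single pixel location, using hypothetical values (background 40 in two frames, hand 220 in the third):

```python
import numpy as np

# Hypothetical values of one pixel across the 3 frames:
# two frames show the background (40), one shows the hand (220).
background_value = 40.0
hand_value = 220.0
samples = np.array([background_value, background_value, hand_value])

mean_value = samples.mean()        # pulled towards the hand
median_value = np.median(samples)  # stays at the background value

# How far the mean is shifted away from the true background
ghost = mean_value - background_value

print(mean_value, median_value, ghost)
```

The mean is shifted away from the background by exactly one third of the hand-to-background difference, which is the faint ghost of the hand, whereas the median stays at the background value.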