Fedour

Reputation: 387

Python: Fast way for removing black pixel in image

I have an image that contains black pixels. They can form vertical lines but also isolated points. I would like to replace these pixels with the average of their neighboring pixels (left and right).

The left and right neighbors of a black pixel are always non-black.

[example image showing the black pixels to be replaced]

For now I have this:

import numpy as np
from matplotlib import pyplot as plt
import time



#Creating test img
test_img = np.full((2048, 2048, 3), dtype = np.uint8, fill_value = (255,127,127))

#Draw vertical black line
test_img[500:1000,1500::12] = (0,0,0)
test_img[1000:1500,1000::24] = (0,0,0)
#Draw black point
test_img[250,250] = (0,0,0)
test_img[300,300] = (0,0,0)

#Fill hole functions
def fill_hole(img):

    #Find coords of black pixels (checking one channel is enough since black is (0,0,0))
    imggray = img[:,:,0]

    
    coords = np.column_stack(np.where(imggray < 1))
    print(len(coords))

    #Return if no black pixel
    if len(coords) == 0:
        return img

    percent_black = len(coords)/(img.shape[0]*img.shape[1]) * 100
    print(percent_black)
    
    #Make a copy of input image
    out = np.copy(img)

    #Iterate over all black pixels
    for p in coords:

        #Edge management: skip border pixels so p[1] + 1 stays in bounds
        if p[0] < 1 or p[0] > img.shape[0] - 1 or p[1] < 1 or p[1] >= img.shape[1] - 1:
            continue

        #Get left and right neighbors of each pixel
        left = img[p[0], p[1] - 1]
        right = img[p[0], p[1] + 1]

        #Get new pixel value as the per-channel average
        r = (int(left[0]) + int(right[0])) // 2
        g = (int(left[1]) + int(right[1])) // 2
        b = (int(left[2]) + int(right[2])) // 2

        out[p[0], p[1]] = [r, g, b]
    return out

#Function call
start = time.time()
img = fill_hole(test_img)
end = time.time()
print(end - start)

This code works fine on my example, but the loop over the list of black pixels gets slower as the number of black pixels grows.

Is there a way to optimize this?

Upvotes: 3

Views: 4133

Answers (3)

Mark Setchell

Reputation: 207405

Note that I have added a significantly faster implementation with numba at the end of the answer.

I wanted to be sure of this working properly with a trickier image than your plain peach background, so I created this:

[test image with a busier background than the plain peach one]

Note that this is just a nasty, inaccurate JPEG representation because the original image is too large for imgur.

Then I ran this code:

#!/usr/bin/env python3

import cv2
import numpy as np

# Load image 2048x2048 RGB
im = cv2.imread('start.png')

# Make mask of black pixels, True where black
blackMask = np.all(im==0, axis=-1)
cv2.imwrite('DEBUG-blackMask.png', (blackMask*255).astype(np.uint8))

# Convolve with [0.5, 0, 0.5] to set each pixel to average of its left and right neighbours
kernel = np.array([0.5, 0, 0.5], dtype=float).reshape(1,-1)
print(kernel.shape)
convolved = cv2.filter2D(im, ddepth=-1, kernel=kernel, borderType=cv2.BORDER_REPLICATE)
cv2.imwrite('DEBUG-convolved.png', convolved)

# Choose either convolved or original image at each pixel
res = np.where(blackMask[...,None], convolved, im)
cv2.imwrite('result.png', res)

And the result is (yet another nasty, resized JPEG):

[result image]

The timings are here, and could probably be improved further - not sure what timings your code achieved or what you need:

In [55]: %timeit blackMask = np.all(im==0, axis=-1)
22.3 ms ± 29.1 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [56]: %timeit convolved = cv2.filter2D(im, ddepth=-1, kernel=kernel, borderType=cv2.BORDER_REPLICATE)
2.66 ms ± 3.07 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [57]: %timeit res = np.where(blackMask[...,None], convolved, im)
22.7 ms ± 76.2 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

So around 46ms in toto. Note that you can comment out all the lines that write output images named DEBUG-xxx.png; they are only there for debugging and are named like that so I can easily clean up after testing.

I think this would run really nicely under numba, but currently llvmlite is not supported on my M1 Mac so I can't try it. Here is something similar with numba.


Optimisations

I had a think about some optimisations to the code above. It seems that two aspects are slower than they might be - making the mask and inserting the convolved values into the array. So, looking first at making the mask, I originally did:

blackMask = np.all(im==0, axis=-1)

and that took 22ms. I tried that with numexpr like this:

import numexpr as ne
R=im[...,0]
G=im[...,1]
B=im[...,2]
blackMask = ne.evaluate('(R==0)&(G==0)&(B==0)')

and that gets the same result but only takes 1.88ms instead of 22ms so a useful saving of 20ms.

As regards the third part, inserting the convolved values into the output array, I found I could do that usefully faster with numexpr too.

So, instead of:

res = np.where(blackMask[...,None], convolved, im)

I used:

blackMask3 = np.dstack((blackMask, blackMask, blackMask))
res = ne.evaluate("where(blackMask3, convolved, im)")

That reduced the time from 22ms to 6ms on my machine. So the total time is now reduced from 46ms to 10.5ms (1.88ms + 2.66ms + 6ms).


I remained convinced that this could be done significantly faster with Numba, as it really falls in Numba's sweet spot: a large image and parallelisable code. I couldn't install Numba on my M1 Mac though, so I found a VERY LOWLY Intel Celeron where Numba could be installed and ran the following code.

The low-spec £200 Intel Celeron machine (4-cores, 8GB DDR4 RAM, eMMC disk) beat the £5,000 M1 Mac (12-core, 32GB DDR5 RAM, NVMe SSD) by a factor of 3, coming in at just over 3ms:

#!/usr/bin/env python3

import cv2
import numpy as np
import numba as nb

@nb.jit('void(uint8[:,:,::1])', parallel=True)
def removeLines(im):
    # Ensure image is 3-channel
    assert (im.ndim == 3) and (im.shape[2] == 3)

    h, w = im.shape[0], im.shape[1]
    for y in nb.prange(h):
        for x in range(1,w-1):
            # Check if black, ignore if not
            sum = im[y,x,0] + im[y,x,1] + im[y,x,2]
            if sum != 0: continue

            # Pixel is black.
            # Replace with mean of left and right neighbours in all channels
            im[y, x, 0] = im[y, x-1, 0] // 2 + im[y, x+1, 0] // 2
            im[y, x, 1] = im[y, x-1, 1] // 2 + im[y, x+1, 1] // 2
            im[y, x, 2] = im[y, x-1, 2] // 2 + im[y, x+1, 2] // 2
    return

# Load image
im = cv2.imread('start.png')
removeLines(im)
cv2.imwrite('result.png', im)

Upvotes: 3

idanz

Reputation: 877

In general, Python for loops over numpy arrays slow things down, and in most cases they can be avoided with numpy's built-in functions. In your case, consider using a convolution on the image; see as a reference: Python get get average of neighbours in matrix with na value
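
A minimal sketch of that convolution idea, assuming scipy is available (the helper name fill_black_by_convolution is just illustrative):

import numpy as np
from scipy.ndimage import convolve1d

def fill_black_by_convolution(img):
    # Mask of black pixels (all three channels are zero)
    mask = np.all(img == 0, axis=-1)

    # Average of the left and right neighbours along each row: kernel [0.5, 0, 0.5]
    kernel = np.array([0.5, 0.0, 0.5])
    averaged = convolve1d(img.astype(np.float32), kernel, axis=1, mode='nearest')

    # Keep the original value everywhere except at the black pixels
    out = img.copy()
    out[mask] = averaged[mask].astype(np.uint8)
    return out

This is essentially the same approach as the cv2.filter2D answer above, just with scipy instead of OpenCV.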

Upvotes: 4

joaopfg

Reputation: 1287

You always apply the same operation to a black pixel, so the work is highly parallelizable. Divide your image into smaller rectangles and have threads and/or processes work on each rectangle; you can tweak the rectangle size to get the best performance (a sketch is shown below).

Also, since the structure of your black pixels is very particular, you can implement a strategy to avoid checking image regions that you are sure contain no black pixels. One idea is to check, for each image column, a few random pixels near the beginning, middle and end of the column, and to skip the column if none of them is black.
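
A minimal sketch of the tiling idea, splitting the image into horizontal stripes so that the left/right neighbours of a pixel always stay inside the same stripe (the helper names fill_stripe and fill_parallel are hypothetical, and each stripe is filled with a vectorised mask rather than a per-pixel loop):

import numpy as np
from concurrent.futures import ThreadPoolExecutor

def fill_stripe(stripe):
    # Vectorised fill inside one horizontal stripe, in place
    mask = np.all(stripe == 0, axis=-1)
    # np.roll wraps around at the first/last column, so those borders are approximate
    left = np.roll(stripe, 1, axis=1)
    right = np.roll(stripe, -1, axis=1)
    avg = (left.astype(np.uint16) + right.astype(np.uint16)) // 2
    stripe[mask] = avg[mask].astype(np.uint8)

def fill_parallel(img, n_stripes=8):
    # Split along the row axis; np.array_split returns views, so the fill happens in place
    stripes = np.array_split(img, n_stripes, axis=0)
    with ThreadPoolExecutor(max_workers=n_stripes) as pool:
        list(pool.map(fill_stripe, stripes))
    return img

Threads can help here because numpy releases the GIL for most of this work; with processes you would have to copy each rectangle back into the output image.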

Upvotes: 2
