Reputation: 387
I have an image that contains black pixels. They can form vertical lines or be isolated points. I would like to replace each of these pixels with the average of its left and right neighbors.
The left and right neighbors of a black pixel are never black themselves.
For now I have this:
import numpy as np
from matplotlib import pyplot as plt
import time
# Create test image
test_img = np.full((2048, 2048, 3), dtype=np.uint8, fill_value=(255, 127, 127))
# Draw vertical black lines
test_img[500:1000, 1500::12] = (0, 0, 0)
test_img[1000:1500, 1000::24] = (0, 0, 0)
# Draw black points
test_img[250, 250] = (0, 0, 0)
test_img[300, 300] = (0, 0, 0)
# Fill-hole function
def fill_hole(img):
    # Find coords of black pixels (tested on the first channel only)
    imggray = img[:, :, 0]
    coords = np.column_stack(np.where(imggray < 1))
    print(len(coords))
    # Return if no black pixel
    if len(coords) == 0:
        return img
    percent_black = len(coords) / (img.shape[0] * img.shape[1]) * 100
    print(percent_black)
    # Make a copy of input image
    out = np.copy(img)
    # Iterate on all black pixels
    for p in coords:
        # Edge management: skip pixels that lack a left or a right neighbor
        if p[0] < 1 or p[0] > img.shape[0] - 1 or p[1] < 1 or p[1] >= img.shape[1] - 1:
            continue
        # Get left and right of each pixel
        left = img[p[0], p[1] - 1]
        right = img[p[0], p[1] + 1]
        # Get new pixel value (integer mean of the two neighbors)
        r = (int(left[0]) + int(right[0])) // 2
        g = (int(left[1]) + int(right[1])) // 2
        b = (int(left[2]) + int(right[2])) // 2
        out[p[0], p[1]] = [r, g, b]
    return out
# Function call
start = time.time()
img = fill_hole(test_img)
end = time.time()
print(end - start)
This code works fine on my example, but the loop over the list of black pixels takes longer as the list grows.
Is there a way to optimize this?
Upvotes: 3
Views: 4133
Reputation: 207405
Note that I have added a significantly faster implementation with numba at the end of the answer.
I wanted to be sure this works properly with a trickier image than your plain peach background, so I created this one:
Note that this is just a nasty, inaccurate JPEG representation because the original image is too large for imgur.
Then I ran this code:
#!/usr/bin/env python3
import cv2
import numpy as np
# Load image 2048x2048 RGB
im = cv2.imread('start.png')
# Make mask of black pixels, True where black
blackMask = np.all(im==0, axis=-1)
cv2.imwrite('DEBUG-blackMask.png', (blackMask*255).astype(np.uint8))
# Convolve with [0.5, 0, 0.5] to set each pixel to average of its left and right neighbours
kernel = np.array([0.5, 0, 0.5], dtype=float).reshape(1,-1)
print(kernel.shape)
convolved = cv2.filter2D(im, ddepth=-1, kernel=kernel, borderType=cv2.BORDER_REPLICATE)
cv2.imwrite('DEBUG-convolved.png', convolved)
# Choose either convolved or original image at each pixel
res = np.where(blackMask[...,None], convolved, im)
cv2.imwrite('result.png', res)
And the result is (yet another nasty, resized JPEG):
The timings are here, and could probably be improved further - not sure what timings your code achieved or what you need:
In [55]: %timeit blackMask = np.all(im==0, axis=-1)
22.3 ms ± 29.1 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [56]: %timeit convolved = cv2.filter2D(im, ddepth=-1, kernel=kernel, borderType=cv2.BORDER_REPLICATE
...: )
2.66 ms ± 3.07 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [57]: %timeit res = np.where(blackMask[...,None], convolved, im)
22.7 ms ± 76.2 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
So around 46ms in total. Note that you can comment out all lines that create output images called DEBUG-xxx.png, since they are just for debugging and are named like that so I can easily clean up after testing.
I think this would run really nicely under numba, but llvmlite is currently not supported on my M1 Mac so I can't try it. Here is something similar with numba.
Optimisations
I had a think about some optimisations to the code above. It seems that two aspects are slower than they might be - making the mask and inserting the convolved values into the array. So, looking first at making the mask, I originally did:
blackMask = np.all(im==0, axis=-1)
and that took 22ms. I tried that with numexpr like this:
import numexpr as ne
R=im[...,0]
G=im[...,1]
B=im[...,2]
blackMask = ne.evaluate('(R==0)&(G==0)&(B==0)')
and that gets the same result but only takes 1.88ms instead of 22ms so a useful saving of 20ms.
As regards the third part, inserting the convolved values into the output array, I found I can do that usefully faster with numexpr too.
So, instead of:
res = np.where(blackMask[...,None], convolved, im)
I used:
blackMask3 = np.dstack((blackMask, blackMask, blackMask))
res = ne.evaluate("where(blackMask3, convolved, im)")
That reduced the time from 22ms to 6ms on my machine. So the total time is now reduced from 46ms to 10.5ms (1.88ms + 2.66ms + 6ms).
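If numexpr is not available, the same two steps (building the mask, then inserting the convolved values) can also be written in plain NumPy. The following is only a minimal sketch, not something I have timed: `im` and `convolved` here are small stand-in arrays made up for illustration, and the placeholder value 7 is not a real convolution result.

```python
import numpy as np

# Stand-in data (assumption): `im` plays the image, `convolved` the filtered image
im = np.full((4, 6, 3), (200, 100, 50), dtype=np.uint8)
im[:, 3] = 0                      # one black column
convolved = np.full_like(im, 7)   # placeholder values, not a real convolution

# Mask from a per-channel bitwise OR: a pixel is black only if R|G|B == 0
blackMask = (im[..., 0] | im[..., 1] | im[..., 2]) == 0

# Copy convolved values into the black pixels only; other pixels stay untouched
res = im.copy()
np.copyto(res, convolved, where=blackMask[..., None])
```

Whether this beats `np.all` plus `np.where` on a 2048x2048 image would need measuring with `%timeit` on your own machine.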
I remained convinced that this could be done significantly faster with Numba, as it really falls in Numba's sweet spot with a large image and parallelisable code. I couldn't install Numba on my M1 Mac though, so I found a VERY LOWLY Intel Celeron where Numba could be installed and ran the following code.
The low-spec £200 Intel Celeron machine (4-cores, 8GB DDR4 RAM, eMMC disk) beat the £5,000 M1 Mac (12-core, 32GB DDR5 RAM, NVMe SSD) by a factor of 3, coming in at just over 3ms:
#!/usr/bin/env python3
import cv2
import numpy as np
import numba as nb
# '::1' declares the last axis contiguous, i.e. a C-contiguous uint8 image
@nb.jit('void(uint8[:,:,::1])', parallel=True)
def removeLines(im):
    # Ensure image is 3-channel
    assert (im.ndim == 3) and (im.shape[2] == 3)
    h, w = im.shape[0], im.shape[1]
    # Rows are independent, so they can be processed in parallel
    for y in nb.prange(h):
        for x in range(1, w-1):
            # Check if black, ignore if not
            sum = im[y,x,0] + im[y,x,1] + im[y,x,2]
            if sum != 0:
                continue
            # Pixel is black.
            # Replace with mean of left and right neighbours in all channels
            im[y, x, 0] = im[y, x-1, 0] // 2 + im[y, x+1, 0] // 2
            im[y, x, 1] = im[y, x-1, 1] // 2 + im[y, x+1, 1] // 2
            im[y, x, 2] = im[y, x-1, 2] // 2 + im[y, x+1, 2] // 2
    return
# Load image (modified in place by removeLines)
im = cv2.imread('start.png')
removeLines(im)
cv2.imwrite('result.png', im)
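A side note on the averaging line in `removeLines`: halving each neighbour before adding keeps the result within 0..255, but it rounds down by one when both neighbours are odd. The sketch below compares three ways of averaging on plain NumPy `uint8` arrays. That is an assumption for illustration only, since integer promotion inside Numba may behave differently, and the sample values are made up.

```python
import numpy as np

left = np.array([255, 1, 200], dtype=np.uint8)
right = np.array([255, 1, 101], dtype=np.uint8)

# Naive sum wraps around in uint8 array arithmetic (255 + 255 becomes 254)
naive = (left + right) // 2

# Halving first never overflows, but drops 1 when both inputs are odd
halves = left // 2 + right // 2

# Widening to uint16 before adding gives the exact floor of the mean
exact = ((left.astype(np.uint16) + right.astype(np.uint16)) // 2).astype(np.uint8)

print(naive.tolist(), halves.tolist(), exact.tolist())
```

For the small difference between `halves` and `exact` to matter, both neighbours have to be odd, so for most images the simpler form is likely good enough.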
Upvotes: 3
Reputation: 877
In general, Python for loops over numpy arrays are slow, and in most cases they can be replaced by numpy built-in functions. In your case, consider using a convolution on the image; see as a reference: Python get get average of neighbours in matrix with na value
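As a rough illustration of replacing the loop with built-in numpy operations, here is a minimal sketch that uses a boolean mask and shifted column indices instead of an explicit convolution. The function name `fill_black_vectorized` and the small 64x64 test image are made up for this example; it treats a pixel as black only when all three channels are zero.

```python
import numpy as np

def fill_black_vectorized(img):
    """Replace pure-black pixels with the mean of their left and right neighbours."""
    out = img.copy()
    # True where all three channels are zero
    black = np.all(img == 0, axis=-1)
    # Leave the first and last columns alone: they lack one neighbour
    black[:, 0] = False
    black[:, -1] = False
    rows, cols = np.nonzero(black)
    # Widen to uint16 so the sum of two uint8 values cannot overflow
    left = img[rows, cols - 1].astype(np.uint16)
    right = img[rows, cols + 1].astype(np.uint16)
    out[rows, cols] = ((left + right) // 2).astype(np.uint8)
    return out

# Small check on the same kind of test image as in the question
test_img = np.full((64, 64, 3), (255, 127, 127), dtype=np.uint8)
test_img[10:20, 30::12] = 0   # vertical black lines
test_img[5, 5] = 0            # isolated black point
result = fill_black_vectorized(test_img)
```

The work is proportional to the number of black pixels after one pass to build the mask, and the input image is left unmodified.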
Upvotes: 4
Reputation: 1287
You always apply the same operation to each black pixel, so the problem is highly parallelizable. Divide your image into smaller rectangles and assign threads and/or processes to work on each one. You can tweak the rectangle size to get the best performance.
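A minimal sketch of that idea, under a few assumptions: it uses full-width row bands rather than rectangles (which avoids looking up neighbours across tile boundaries), it uses threads from the standard library, and the names `fill_band`, `fill_parallel` and `n_bands` are hypothetical. Whether it is actually faster than a single vectorized pass depends on the image size and the machine, and has not been measured here.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def fill_band(band):
    """Fill black pixels in one horizontal band, in place."""
    black = np.all(band == 0, axis=-1)
    black[:, 0] = False    # edge columns lack a neighbour
    black[:, -1] = False
    rows, cols = np.nonzero(black)
    left = band[rows, cols - 1].astype(np.uint16)
    right = band[rows, cols + 1].astype(np.uint16)
    band[rows, cols] = ((left + right) // 2).astype(np.uint8)

def fill_parallel(img, n_bands=4):
    """Split the image into row bands and fill each band in its own thread."""
    out = img.copy()
    # Basic slices are views, so each worker writes directly into `out`
    edges = np.linspace(0, out.shape[0], n_bands + 1, dtype=int)
    bands = [out[a:b] for a, b in zip(edges[:-1], edges[1:])]
    with ThreadPoolExecutor(max_workers=n_bands) as pool:
        list(pool.map(fill_band, bands))   # list() waits and surfaces errors
    return out

img = np.full((64, 64, 3), (255, 127, 127), dtype=np.uint8)
img[10:50, 20] = 0
res = fill_parallel(img)
```

Because the bands do not overlap, the workers never write to the same pixel, so no locking is needed.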
Also, since the structure of your black pixels is very particular, you can implement a strategy that avoids checking image regions you are sure contain no black pixels. One idea is to check, for each image column, a few random pixels at the beginning, middle and end of the column, and then ignore the column if the check finds no black pixels.
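The column pre-check could look roughly like the sketch below. It is a simplified variant that samples random rows across the whole height instead of the beginning, middle and end separately, and `candidate_columns`, `fill_columns`, `n_samples` and `seed` are hypothetical names. Note that sampling can miss short lines and will almost certainly miss isolated points, so it only suits images dominated by long vertical lines.

```python
import numpy as np

def candidate_columns(img, n_samples=16, seed=0):
    """Return indices of columns where at least one sampled row is black."""
    rng = np.random.default_rng(seed)
    n = min(n_samples, img.shape[0])
    rows = rng.choice(img.shape[0], size=n, replace=False)
    sampled_black = np.all(img[rows] == 0, axis=-1)   # shape (n, width)
    return np.nonzero(sampled_black.any(axis=0))[0]

def fill_columns(img, cols):
    """Fill black pixels only in the given columns."""
    out = img.copy()
    # Drop edge columns, which lack a left or right neighbour
    cols = cols[(cols > 0) & (cols < img.shape[1] - 1)]
    sub = img[:, cols]                                 # shape (h, k, 3)
    black = np.all(sub == 0, axis=-1)                  # shape (h, k)
    left = img[:, cols - 1].astype(np.uint16)
    right = img[:, cols + 1].astype(np.uint16)
    mean = ((left + right) // 2).astype(np.uint8)
    out[:, cols] = np.where(black[..., None], mean, sub)
    return out

img = np.full((32, 32, 3), (255, 127, 127), dtype=np.uint8)
img[:, 10] = 0                       # full-height line, so every sample hits it
cols = candidate_columns(img)
res = fill_columns(img, cols)
```

Raising `n_samples` lowers the chance of missing a line at the cost of a more expensive pre-check.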
Upvotes: 2