Reputation: 3862
Suppose I have a binary image, represented as a numpy matrix where a pixel is either background (0) or foreground (1). I'm looking for a way to delete all foreground pixels that don't have any neighbouring foreground pixel.
Suppose the image matrix is:
a = np.array([[0,0,1,1],[1,0,0,0]])
The resulting image after single-pixel deletion should be:
b = np.array([[0,0,1,1],[0,0,0,0]])
My approach so far is doing a combination of openings for all possible directions:
import numpy as np
from scipy import ndimage as ndi

# `edge` is the binary input image
opening1 = ndi.binary_opening(edge, structure=np.array([[0,1,0],[0,1,0],[0,0,0]]))  # vertical pair
opening2 = ndi.binary_opening(edge, structure=np.array([[0,0,0],[0,1,1],[0,0,0]]))  # horizontal pair
opening3 = ndi.binary_opening(edge, structure=np.array([[1,0,0],[0,1,0],[0,0,0]]))  # main-diagonal pair
opening4 = ndi.binary_opening(edge, structure=np.array([[0,0,1],[0,1,0],[0,0,0]]))  # anti-diagonal pair
opening = opening1 | opening2 | opening3 | opening4
An alternative would be to label the connected components and delete the single-pixel ones by index; however, both solutions feel sub-optimal when it comes to computational complexity.
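For reference, the labelling alternative mentioned above might look like the following sketch. It assumes 8-connectivity (diagonal pixels count as neighbours, as in the openings), and the function name `remove_isolated_pixels` is only illustrative.

```python
import numpy as np
from scipy import ndimage as ndi


def remove_isolated_pixels(image):
    """Return a copy of `image` without single-pixel foreground components."""
    # A 3x3 block of ones makes diagonal pixels count as connected (assumption).
    eight_connected = np.ones((3, 3), dtype=int)
    labels, _ = ndi.label(image, structure=eight_connected)
    # sizes[k] is the number of pixels carrying label k (label 0 is background).
    sizes = np.bincount(labels.ravel())
    keep = sizes > 1
    keep[0] = False  # never turn background into foreground
    # Look up, for every pixel, whether its component is kept.
    return keep[labels].astype(image.dtype)


a = np.array([[0, 0, 1, 1], [1, 0, 0, 0]])
b = remove_isolated_pixels(a)
print(b)
```

The lookup `keep[labels]` avoids an explicit Python loop over the components, so the cost is dominated by the single labelling pass.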
Upvotes: 0
Views: 1253
Reputation: 1
The following script seems to work. It removes all islands of size 1, then all islands up to size 3. It could be used to remove larger islands too, but removing islands of size 1 is enough to answer your question. It removes islands of both colors, so you get rid of dirt not only around the text but also inside the letters. The script considers pixels with touching corners to be adjacent (remove the diagonal checks to count only touching sides as adjacent).
The script loops over all the rows and for each row, it loops over each pixel from left to right and checks if it is an island.
The reason to remove smaller islands first, and then larger, is a case like this:
□□□□
□□■■
□□■□
When searching for islands of size 1, it will find the white size 1 island in the bottom right corner and make it black. Then it will not find any more islands, because the black region now has size 4.
If it just searched for islands up to size 3 immediately, it would find the black island of size 3 and make it white.
The script assumes that the first command line parameter is the name of an image file that PIL can open and numpy can convert into an array. The script also assumes that each array element is 0 or 1. The result is written to 'out.tif' (overwriting it if it exists).
Note that I don't normally use Python and the script is not optimized at all. I tried it on a TIFF originating from an A4 page scanned at 300 DPI. It took a while, but the script is already well worth using. Before optimizing it, test cases should be created to detect regressions.
The value 7 is just used during the checking and could be any value of the data type other than 0 or 1.
#! /usr/bin/python
from PIL import Image
import sys
import numpy

imarray = numpy.array(Image.open(sys.argv[1]))
h = imarray.shape[0]
w = imarray.shape[1]
markColor = 7
maxSize = 0

def isIsland(color, i, j):
    global maxSize
    if color != imarray[i, j]:
        return True
    if 0 == maxSize:
        return False
    imarray[i, j] = markColor
    maxSize -= 1
    if 0 < i:
        if 0 < j:
            if not isIsland(color, i - 1, j - 1):
                return False
        if not isIsland(color, i - 1, j):
            return False
        if j + 1 < w:
            if not isIsland(color, i - 1, j + 1):
                return False
    if 0 < j:
        if not isIsland(color, i, j - 1):
            return False
    if j + 1 < w:
        if not isIsland(color, i, j + 1):
            return False
    if i + 1 < h:
        if 0 < j:
            if not isIsland(color, i + 1, j - 1):
                return False
        if not isIsland(color, i + 1, j):
            return False
        if j + 1 < w:
            if not isIsland(color, i + 1, j + 1):
                return False
    return True

def fill(color, i, j):
    if markColor != imarray[i, j]:
        return
    imarray[i, j] = color
    if 0 < i:
        if 0 < j:
            fill(color, i - 1, j - 1)
        fill(color, i - 1, j)
        if j + 1 < w:
            fill(color, i - 1, j + 1)
    if 0 < j:
        fill(color, i, j - 1)
    if j + 1 < w:
        fill(color, i, j + 1)
    if i + 1 < h:
        if 0 < j:
            fill(color, i + 1, j - 1)
        fill(color, i + 1, j)
        if j + 1 < w:
            fill(color, i + 1, j + 1)

for s in [1, 3]:
    for i in range(h):
        for j in range(w):
            islandColor = imarray[i, j]
            maxSize = s
            if isIsland(islandColor, i, j):
                fill((islandColor + 1) % 2, i, j)
            else:
                fill(islandColor, i, j)

Image.fromarray(imarray).save('out.tif')
Upvotes: 0
Reputation: 446
What about this?
The idea is to shift the image by one pixel in each direction and then determine whether a pixel has a neighbor by checking whether it coincides with a foreground pixel in any of the shifted images.
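import numpy as np

a = np.array([[0,0,1,1],[1,0,0,0]])
print(a)
# Shift the image by one pixel along each axis, in both directions.
# Note: np.roll wraps around, so pixels on opposite borders are treated as neighbors.
shift_bottom = np.roll(a, -1, axis=0)
shift_top = np.roll(a, 1, axis=0)
shift_right = np.roll(a, 1, axis=1)
shift_left = np.roll(a, -1, axis=1)
# Keep a pixel only if it coincides with a foreground pixel in at least one shift.
b = (a * shift_bottom) | (a * shift_top) | (a * shift_right) | (a * shift_left)
print(b)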
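Because `np.roll` wraps around the array borders, a pixel on the left edge would be treated as adjacent to one on the right edge, and the four shifts above only cover side-by-side neighbors. A possible variation of the same shift idea is sketched below. It assumes zero padding instead of wrap-around and 8-connectivity (to match the diagonal openings in the question); the function name `remove_isolated_pixels` is only illustrative.

```python
import numpy as np


def remove_isolated_pixels(image):
    """Zero out foreground pixels that have no foreground pixel among their 8 neighbors."""
    h, w = image.shape
    # Pad with a one-pixel border of background so shifts never wrap around.
    padded = np.pad(image, 1, mode="constant", constant_values=0)
    has_neighbor = np.zeros((h, w), dtype=bool)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue  # skip the pixel itself
            # View of the image shifted by (di, dj), with zeros beyond the border.
            shifted = padded[1 + di:1 + di + h, 1 + dj:1 + dj + w]
            has_neighbor |= shifted.astype(bool)
    # Keep a pixel only where at least one neighbor is foreground.
    return np.where(has_neighbor, image, 0)


a = np.array([[0, 0, 1, 1], [1, 0, 0, 0]])
b = remove_isolated_pixels(a)
print(b)
```

Dropping the four diagonal offsets from the loop would restrict the check to touching sides only.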
Upvotes: 0