Chronicle

Reputation: 1645

Find wrong colored pixels between boundaries

In an image I have a large number of cells of various colors separated by black boundaries. However, the boundaries were not drawn perfectly, and now some cells have a handful of pixels of the wrong color (every cell should contain only 1 color).

In the following image, I have encircled the pixels that are the wrong color. The blue pixels encircled in the top-left should be grey, and the grey pixels encircled in the other three spots should be blue.

cells

Question: How do I find the wrongly colored pixels so that I can replace them with the right color?

Currently I am using Python and NumPy to load the image into an array, and a nested for-loop (rows, then columns) to check every pixel.

My current method checks, for every pixel, the four pixels that directly border it (row + 1, row - 1, column + 1 and column - 1). If a neighbor is a different non-black color, I check that neighbor's bordering pixels, and if any of them is also non-black and different from the original pixel, I change the original pixel to the neighbor's color.

However, it doesn't work correctly when there is more than one incorrect pixel next to each other, leading to the following image:

wrong

Below is the script I use. I am looking for either a way to improve it or a different algorithm altogether. The image required by the code is right below it. I have already matched its name in the code to the name Stack Overflow gave it.

from PIL import Image
import numpy as np

BLACK = (0,0,0)

im = Image.open("3gOg0.png").convert('RGB')
im.load()
im_array = np.asarray(im, dtype="int32")
(height, width, dim) = im_array.shape
newim_array = np.array(im_array)

for row in range(height):
    for col in range(width):
        rgb = tuple(im_array[row,col])
        if rgb == BLACK:
            continue

        n = tuple(im_array[row-1,col])
        s = tuple(im_array[row+1,col])
        e = tuple(im_array[row,col+1])
        w = tuple(im_array[row,col-1])

        if n != BLACK and n != rgb:
            nn = tuple(im_array[row-2,col])
            ne = tuple(im_array[row-1,col+1])
            nw = tuple(im_array[row-1,col-1])
            if (nn != BLACK and nn != rgb) or (nw != BLACK and nw != rgb) or (ne != BLACK and ne != rgb):
                newim_array[row,col] = n
                continue

        if s != BLACK and s != rgb:
            ss = tuple(im_array[row+2,col])
            se = tuple(im_array[row+1,col+1])
            sw = tuple(im_array[row+1,col-1])
            if (ss != BLACK and ss != rgb) or (sw != BLACK and sw != rgb) or (se != BLACK and se != rgb):
                newim_array[row,col] = s
                continue

        if e != BLACK and e != rgb:
            ee = tuple(im_array[row,col+2])
            ne = tuple(im_array[row-1,col+1])
            se = tuple(im_array[row+1,col+1])
            if (ee != BLACK and ee != rgb) or (se != BLACK and se != rgb) or (ne != BLACK and ne != rgb):
                newim_array[row,col] = e
                continue

        if w != BLACK and w != rgb:
            ww = tuple(im_array[row,col-2])
            nw = tuple(im_array[row-1,col-1])
            sw = tuple(im_array[row+1,col-1])
            if (ww != BLACK and ww != rgb) or (nw != BLACK and nw != rgb) or (sw != BLACK and sw != rgb):
                newim_array[row,col] = w

im2 = Image.fromarray(np.uint8(newim_array))
im2.save("fix.png")

This is the example image at its actual, non-zoomed size:

[image: 3gOg0.png]

Upvotes: 1

Views: 344

Answers (2)

Aaron

Reputation: 11075

I would take a connected-component labeling approach, although there are many ways to skin a cat.

  1. Extract only the black lines to use as your connected-component mask
  2. Find the 4-connected regions of white (non-black) pixels in that mask (connected-component labeling)
  3. For each connected region, find the most frequently occurring color
  4. Set the entire region to that one color

Example implementation:

import numpy as np
from scipy import ndimage

# Input array, assuming 0 for black, 1 for blue and 2 for purple
arr = np.array(...)

# Label the connected non-black regions
labeled, labels = ndimage.label(arr != 0,
                                structure=[[0,1,0],
                                           [1,1,1],
                                           [0,1,0]])  # this is the default (4-connectivity), but we'll specify it explicitly anyway

for labelnum in range(1, labels+1):
    mask = labeled == labelnum  # boolean mask of the pixels in this region
    values, counts = np.unique(arr[mask], return_counts=True)  # count each color in the region
    arr[mask] = values[counts.argmax()]  # set every pixel in the region to its most frequent color
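
The example above assumes the image has already been reduced to integer color codes. Below is a minimal sketch of how the same idea might be applied directly to an RGB array like the one in the question. The function name `fix_cells` and its variable names are hypothetical, and it assumes the boundaries are exactly (0, 0, 0).

```python
import numpy as np
from scipy import ndimage


def fix_cells(rgb):
    """Return a copy of an (H, W, 3) array in which every 4-connected
    non-black region is filled with its most frequent color.

    Hypothetical helper; assumes boundary pixels are exactly (0, 0, 0).
    """
    h, w, _ = rgb.shape
    # Map every distinct RGB triple to a small integer code.
    palette, codes = np.unique(rgb.reshape(-1, 3), axis=0, return_inverse=True)
    codes = codes.reshape(h, w)

    # True wherever the pixel is not pure black.
    not_black = rgb.any(axis=-1)

    # The default structuring element of ndimage.label is 4-connected.
    labeled, num_labels = ndimage.label(not_black)

    out = codes.copy()
    for labelnum in range(1, num_labels + 1):
        mask = labeled == labelnum
        # Most frequent color code inside this region.
        out[mask] = np.bincount(codes[mask]).argmax()

    # Translate the integer codes back to RGB triples.
    return palette[out]
```

With the question's script, this could be called as `newim_array = fix_cells(im_array)` before saving. A cell that leaks into its neighbor through a gap in the boundary would be labeled as one region, so this only helps when the black lines fully enclose each cell.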

Upvotes: 1

Scott Hunter

Reputation: 49893

Sounds like you have 2 issues:

  1. What are the regions?
  2. What color should each be?

To find the regions and fill each with its currently most common color:

For each non-black pixel not yet visited:
    Start a new region; initialize a counter for each color
    Recursively:
        Mark the pixel as in-region
        Increment the counter for that pixel's color
        Visit each adjacent pixel that is neither black nor in-region
    When done:
        Set all in-region pixels to the color with the highest count, and mark them as visited
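
The pseudocode above could be sketched in Python roughly as follows. This is one possible reading, not a definitive implementation: `fill_regions` is a hypothetical name, it uses an explicit stack in place of recursion (to stay clear of Python's recursion limit on large cells), and it assumes an (H, W, 3) NumPy array with boundaries that are exactly (0, 0, 0).

```python
from collections import Counter

import numpy as np

BLACK = (0, 0, 0)


def fill_regions(im_array):
    """Fill each 4-connected non-black region with its majority color.

    Hypothetical helper; returns a new array and leaves the input untouched.
    """
    height, width, _ = im_array.shape
    out = im_array.copy()
    visited = np.zeros((height, width), dtype=bool)

    for row in range(height):
        for col in range(width):
            if visited[row, col] or tuple(im_array[row, col]) == BLACK:
                continue

            # Start a new region and count the colors found in it.
            counts = Counter()
            region = []
            stack = [(row, col)]
            visited[row, col] = True

            while stack:
                r, c = stack.pop()
                region.append((r, c))
                counts[tuple(im_array[r, c])] += 1
                # Visit the four direct neighbors that are in bounds,
                # not black, and not already part of a region.
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and not visited[nr, nc]
                            and tuple(im_array[nr, nc]) != BLACK):
                        visited[nr, nc] = True
                        stack.append((nr, nc))

            # Recolor the whole region with its most common color.
            majority = counts.most_common(1)[0][0]
            rows, cols = zip(*region)
            out[rows, cols] = majority

    return out
```

The bounds check also avoids the index wrap-around and `IndexError` that plain `row-1` / `row+1` indexing can hit at the image edges.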

Upvotes: 2
