Reputation: 655

How to convert a 3D RGB label image (in semantic segmentation) to a 2D gray image with class indices starting from 0?

I have an RGB semantic segmentation label image. Suppose it contains 3 classes, and each pixel's RGB value is one of:

[255, 255, 0], [0, 255, 255], [255, 255, 255]

I want to map every pixel of the RGB image to a new 2D label image according to this dict:

{(255, 255, 0): 0, (0, 255, 255): 1, (255, 255, 255): 2}

so that every value in the new gray label image is 0, 1 or 2. Is there an efficient way to do this, for example with broadcasting in NumPy?

Upvotes: 7

Answers (4)

Mendrika

Reputation: 31

I've also answered this question here: Convert RGB image to index image

Basically:

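import numpy as np

cmap = {(255, 255, 0): 0, (0, 255, 255): 1, (255, 255, 255): 2}

def rgb2mask(img):

    assert len(img.shape) == 3
    height, width, ch = img.shape
    assert ch == 3

    # Weights that pack each pixel into one integer: R + 256*G + 65536*B
    W = np.power(256, [[0],[1],[2]])

    img_id = img.dot(W).squeeze(-1)
    values = np.unique(img_id)

    mask = np.zeros(img_id.shape)

    # Iterate over the packed colour ids themselves, not (index, id) pairs
    for c in values:
        try:
            # Look up the class via the RGB triple of the first pixel with this id
            mask[img_id==c] = cmap[tuple(img[img_id==c][0])]
        except KeyError:
            # Colours missing from cmap keep the default label 0
            pass
    return mask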

You can extend the dictionary as you want.
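The same packing idea can also be written without the Python loop over unique values, by indexing a lookup table with the packed colour codes. This is only a sketch and not part of the answer above: the function name rgb2mask_lut and the ignore_index parameter (the label given to colours that are not in the dictionary) are assumptions made for illustration, and the table costs 16 MB of memory.

```python
import numpy as np

# Colour-to-class mapping copied from the question.
cmap = {(255, 255, 0): 0, (0, 255, 255): 1, (255, 255, 255): 2}

def rgb2mask_lut(img, cmap, ignore_index=255):
    """Map an (H, W, 3) uint8 image to an (H, W) uint8 label mask.

    rgb2mask_lut and ignore_index are hypothetical names for this sketch.
    """
    img = np.asarray(img, dtype=np.uint32)
    # Pack each pixel into one integer: R + 256*G + 65536*B.
    codes = img[..., 0] + (img[..., 1] << 8) + (img[..., 2] << 16)
    # One table entry per possible 24-bit colour; unmapped colours get ignore_index.
    lut = np.full(1 << 24, ignore_index, dtype=np.uint8)
    for (r, g, b), idx in cmap.items():
        lut[r + (g << 8) + (b << 16)] = idx
    # Fancy indexing applies the table to every pixel at once.
    return lut[codes]
```

If many images share one mapping, the table could be built once and reused instead of being rebuilt on every call.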

Upvotes: 1

Mark Setchell

Reputation: 207560

I had a try at this...

First off, I noticed that in the following table of RGB values, the Green values are all the same so there is no point checking them.

Secondly, if you divide the values in the array by 255, you get zeroes and ones which are very close to the labelling you need. So, if you do a little maths:

t = R/255 + 2B/255 -1

then you get this for the values in the dictionary:

  R   G   B    t
==================
255 255   0    0
  0 255 255    1
255 255 255    2
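The arithmetic in that table can be checked in a few lines. This is only a sanity check of t = R/255 + 2B/255 - 1 against the dictionary from the question; the names colours and t_values are made up for the check. Note that the formula is specific to these three colours. Any other colour, such as (0, 255, 0), would not give a meaningful label.

```python
# Mapping from the question: RGB triple -> class index.
colours = {(255, 255, 0): 0, (0, 255, 255): 1, (255, 255, 255): 2}

# Evaluate t = R/255 + 2*B/255 - 1 for each colour (integer division is
# exact here because every channel value is either 0 or 255).
t_values = {rgb: rgb[0] // 255 + 2 * (rgb[2] // 255) - 1 for rgb in colours}

# The formula reproduces the requested labels for all three colours.
assert t_values == colours
```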

The code to compare with a couple of other answers looks like this:

#!/usr/bin/env python3

import numpy as np

def me(img): 
    """Return R + 2B - 1 as label"""
    return np.uint8((img[:,:,0]/255) + 2*(img[:,:,2]/255) - 1) 

def deepak(img):
    r = np.array([255, 255, 0])
    g = np.array([0, 255, 255])
    b = np.array([255, 255, 255])

    label_seg = np.zeros((img.shape[:2]), dtype=np.uint8)
    label_seg[(img==r).all(axis=2)] = 0
    label_seg[(img==g).all(axis=2)] = 1
    label_seg[(img==b).all(axis=2)] = 2
    return label_seg

def marios(label):
    mask_mapping = {
       (255, 255, 0):   0,
       (0, 255, 255):   1,
       (255, 255, 255): 2,
    }
    for k in mask_mapping:
        label[(label == k).all(axis=2)] = mask_mapping[k]

    return label

# Generate a sample image
img = np.zeros((480,640,3), dtype=np.uint8)
img[:160,:,:]    = [255,255,0]
img[160:320,:,:] = [0,255,255]
img[320:,:,:]    = [255,255,255]

The timings come out like this:

In [134]: %timeit deepak(img)
15.4 ms ± 181 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [135]: %timeit marios(img)
15.4 ms ± 166 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [172]: %timeit me(img)
869 µs ± 8.93 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Whether the 18x speedup is worth the less readable code is another argument, though comments can go a good way to helping :-)

Note, in fairness to Deepak, his time can be reduced to 10.3 ms by removing the unnecessary line below, which zeroes some elements in an array that is already all zeroes:

label_seg[(img==r).all(axis=2)] = 0

Upvotes: 0

MariosOreo

Reputation: 155

How about this one:

mask_mapping = {
    (255, 255, 0):   0,
    (0, 255, 255):   1,
    (255, 255, 255): 2,
}
for k in mask_mapping:
    label[(label == k).all(axis=2)] = mask_mapping[k]

I think it's based on the same idea as the accepted method, but it looks clearer.
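One thing to be aware of: because label has three channels, the assignment in the loop above writes the class index into all three channels of each matching pixel, so label keeps its (H, W, 3) shape and is modified in place. A minimal sketch of a variant that leaves the input untouched and returns a 2D array could look like this (the function name rgb_to_index is an assumption, not something from the answer):

```python
import numpy as np

mask_mapping = {
    (255, 255, 0):   0,
    (0, 255, 255):   1,
    (255, 255, 255): 2,
}

def rgb_to_index(label_rgb, mask_mapping):
    """Return an (H, W) uint8 index image; label_rgb is not modified."""
    out = np.zeros(label_rgb.shape[:2], dtype=np.uint8)
    for colour, idx in mask_mapping.items():
        # Boolean (H, W) mask of pixels whose three channels all match colour.
        matches = (label_rgb == np.asarray(colour)).all(axis=2)
        out[matches] = idx
    return out
```

Pixels whose colour is not in mask_mapping keep the value 0 here, which is indistinguishable from class 0, so a different fill value may be preferable if unknown colours can occur.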

Upvotes: 1

Deepak Saini

Reputation: 2910

You can do this:

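import numpy as np

# the RGB colours of the three classes
r = np.array([255, 255, 0])
g = np.array([0, 255, 255])
b = np.array([255, 255, 255])

# np.int was removed in NumPy 1.24; the built-in int is the drop-in replacement
label_seg = np.zeros((img.shape[:2]), dtype=int)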
label_seg[(img==r).all(axis=2)] = 0
label_seg[(img==g).all(axis=2)] = 1
label_seg[(img==b).all(axis=2)] = 2

So that, if

img = np.array([[r,g,b],[r,r,r],[b,g,r],[b,g,r]])

then,

label_seg = array([[0, 1, 2],
                   [0, 0, 0],
                   [2, 1, 0],
                   [2, 1, 0]])

Upvotes: 1
