mchd

Reputation: 3163

How to change the channels of a mask

I got segmentation masks for a dataset that look like this:

[image: example of a color segmentation mask]

I want to convert each mask file to something like this, where every class is a different shade of gray and the image has a single channel: [image: example of a grayscale mask]

The latter kind of mask works better with the code below, but the dataset I want to use has the "colorful" masks:

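import math
import os
import random
from glob import glob

import tensorflow as tf
import tensorflow_addons as tfa

# CIHP has 20 labels and Headsegmentation has 14 labels
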
image_size = 512
batch = 4
labels = 20
data_directory = "/content/CIHP/instance-level_human_parsing/"
sample_train_images = len(os.listdir(data_directory + 'Training/Images/')) - 1
sample_validation_images = len(os.listdir(data_directory + 'Validation/Images/')) - 1
test_images = len(os.listdir('/content/headsegmentation_final/Test/')) - 1
print('Train size: ' + str(sample_train_images))
print('Validation size: ' + str(sample_validation_images))

t_images = sorted(glob(os.path.join(data_directory, "Training/Images/*")))[:sample_train_images]
t_masks = sorted(glob(os.path.join(data_directory, "Training/Category_ids/*")))[:sample_train_images]
v_images = sorted(glob(os.path.join(data_directory, "Validation/Images/*")))[:sample_validation_images]
v_masks = sorted(glob(os.path.join(data_directory, "Validation/Category_ids/*")))[:sample_validation_images]
# The test path is absolute, so joining it with data_directory has no effect; glob it directly.
test_images = sorted(glob("/content/headsegmentation_final/Test/*"))[:test_images]

def image_augmentation(img, random_range):
    img = tf.image.random_flip_left_right(img)
    img = tfa.image.rotate(img, random_range)

    return img

def image_process(path, mask=False):
    img = tf.io.read_file(path)

    upper = 90 * (math.pi/180.0) # degrees -> radian
    lower = 0 * (math.pi/180.0)
    ran_range = random.uniform(lower, upper)

    if mask:
        img = tf.image.decode_png(img, channels=1)
        img.set_shape([None, None, 1])
        # Nearest-neighbor keeps label IDs intact; the default bilinear resize would blend them.
        img = tf.image.resize(images=img, size=[image_size, image_size], method="nearest")
        #img = image_augmentation(img, ran_range)

    else:
        img = tf.image.decode_jpeg(img, channels=3)
        img.set_shape([None, None, 3])
        img = tf.image.resize(images=img, size=[image_size, image_size])
        img = img / 127.5 - 1
        #img = image_augmentation(img, ran_range)

    return img

def data_loader(image_list, mask_list):
    img = image_process(image_list)
    mask = image_process(mask_list, mask=True)
    return img, mask

def data_generator(image_list, mask_list):

    cihp_dataset = tf.data.Dataset.from_tensor_slices((image_list, mask_list))
    cihp_dataset = cihp_dataset.map(data_loader, num_parallel_calls=tf.data.AUTOTUNE)
    cihp_dataset = cihp_dataset.batch(batch, drop_remainder=True)

    return cihp_dataset

train_dataset = data_generator(t_images, t_masks)
val_dataset = data_generator(v_images, v_masks)

print("Train Dataset:", train_dataset)
print("Val Dataset:", val_dataset)

Basically, I want to iterate over every "colorful" mask file and convert it to the latter format. I can do the iteration, but I don't know how to convert the mask files.

Upvotes: 1

Views: 1418

Answers (1)

Shai

Reputation: 114866

Judging from your code, it seems (and it makes a lot of sense) that your label images are not really RGB images (i.e., 3 channels per pixel), but rather single-channel indexed images whose colors come from a palette:

if mask == True:
  img = tf.image.decode_png(img, channels=1)  

When you display a label image (how exactly are you doing that?), the viewer applies a color map that assigns a specific color to each label index.
However, when your image_process function reads a mask, it does not return a 3-channel RGB image; it returns only the label index map, which you can treat as a grayscale image. No conversion is needed.
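
If you want to check this, or if some masks really do turn out to be stored as 3-channel color images, a minimal NumPy sketch along the following lines may help. It is not part of the original code: PALETTE, describe_mask and rgb_to_index are hypothetical names, and the colors in PALETTE are placeholders that would have to be replaced with the dataset's actual label colors.

```python
import numpy as np

# Hypothetical palette (assumption): row i is the RGB color used for label i.
# Replace these placeholder colors with the ones your dataset really uses.
PALETTE = np.array([
    [0, 0, 0],      # label 0
    [128, 0, 0],    # label 1
    [0, 128, 0],    # label 2
], dtype=np.uint8)


def describe_mask(mask):
    """Return (number of channels, unique values or unique colors) of a mask array."""
    if mask.ndim == 2:
        return 1, np.unique(mask)
    flat = mask.reshape(-1, mask.shape[-1])
    return mask.shape[-1], np.unique(flat, axis=0)


def rgb_to_index(mask_rgb, palette=PALETTE):
    """Map an (H, W, 3) color mask to an (H, W) uint8 label-index map."""
    # Compare every pixel with every palette color -> boolean array of shape (H, W, K).
    matches = (mask_rgb[:, :, None, :] == palette[None, None, :, :]).all(axis=-1)
    if not matches.any(axis=-1).all():
        raise ValueError("mask contains a color that is not in the palette")
    # The index of the matching palette row is the label.
    return matches.argmax(axis=-1).astype(np.uint8)


# Tiny synthetic example: build a 2x2 color mask from known labels, then invert it.
rgb = PALETTE[np.array([[0, 1], [2, 1]])]
labels = rgb_to_index(rgb)
print(describe_mask(rgb))
print(describe_mask(labels))
```

To run this on real files, one option is to load a mask with Pillow via np.array(Image.open(path)). For a palette PNG (mode "P") that already yields a 2-D array of label indices, in which case describe_mask reports one channel and rgb_to_index is not needed.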

Upvotes: 2
