Dan Beri

Reputation: 83

The role / effect of Color-Information on CNN

As part of a project, I would like to investigate the role and effect of color in a CNN. Unfortunately, I have found little information so far, and would like to hear whether you have any literature suggestions for me.

Basically, I would like to investigate how and why colors influence a CNN, and how large that influence is. Why should I use an image with 3 channels rather than an image with only one channel?

Furthermore, I would like to investigate what influence color spaces have. I have found one paper on this, but perhaps someone knows of other useful literature.

Do you have an idea how I can best do this investigation?

I have thought about the following:

  1. Train a CNN (e.g. VGG16) on an RGB dataset.
  2. Train the same CNN on the same dataset converted to grayscale.
  3. Compare the performance, the learned filters (I don't know if this is useful), and the feature maps.
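
The grayscale variant in step 2 could be prepared along the following lines. This is only a minimal sketch: the luma weights are the ITU-R BT.601 ones (an assumption, other weightings exist), the helper names `rgb_to_gray` and `image_to_gray` are hypothetical, and images are assumed to be nested lists of `(r, g, b)` tuples with values in [0, 1].

```python
# ITU-R BT.601 luma weights (an assumption; other weightings exist).
LUMA_WEIGHTS = (0.299, 0.587, 0.114)


def rgb_to_gray(pixel):
    """Collapse one (r, g, b) pixel to a single luma value."""
    r, g, b = pixel
    wr, wg, wb = LUMA_WEIGHTS
    return wr * r + wg * g + wb * b


def image_to_gray(image, keep_three_channels=False):
    """Convert an image given as rows of (r, g, b) tuples to grayscale.

    With keep_three_channels=True the gray value is replicated into three
    channels, so a network expecting 3-channel input can be reused unchanged.
    """
    out = []
    for row in image:
        gray_row = []
        for px in row:
            y = rgb_to_gray(px)
            gray_row.append((y, y, y) if keep_three_channels else y)
        out.append(gray_row)
    return out
```

Replicating the gray value into three channels keeps the architecture identical between both runs, so any performance difference should come from the missing color information rather than from a changed first layer.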

--

For the second question (color spaces) I would proceed analogously.

  1. Train a CNN on an RGB dataset.
  2. Train the same CNN on the dataset converted to HSV, and so on for other color spaces.

Am I on the right track? Do you have any suggestions on how to do this better?
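
For the HSV variant, the conversion could be sketched with Python's standard `colorsys` module. The helper name `image_to_hsv` is hypothetical, and images are again assumed to be nested lists of `(r, g, b)` tuples with values in [0, 1].

```python
import colorsys


def image_to_hsv(image):
    """Convert rows of (r, g, b) tuples in [0, 1] to (h, s, v) tuples in [0, 1]."""
    return [
        [colorsys.rgb_to_hsv(r, g, b) for (r, g, b) in row]
        for row in image
    ]


# Example: a 1x2 image with a pure red and a pure green pixel.
hsv_image = image_to_hsv([[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]])
```

Note that hue is circular: values just above 0 and just below 1 both describe red, which may matter when the hue channel is fed to a network as an ordinary number.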

I would be very happy about answers. Thanks to all, Dan

Upvotes: 3

Views: 1548

Answers (1)

Abhi25t

Reputation: 4643

A CNN's invariance to a property such as color is derived from your data. The CNN has only the data from which to learn whether color is a decisive factor for recognizing an object.

Suppose you want to recognize digits like those in the MNIST dataset. For the digit '8', color has no semantic meaning: an '8' is an '8' whether it is red or green. If you only present the CNN with red '8's, it will learn that red is a decisive factor for recognizing an '8'. If you instead present it with a large number of '8's in different colors, the CNN will learn that color has little influence on recognizing an '8', and the weights for the red channel or red features will not be dominant. Since color is unlikely to give any performance boost here, we can convert the images to grayscale and expect minimal change in performance.

The ImageNet dataset, however, consists primarily of natural images, where color plays a semantic role. A cat, for instance, may be white, black or brown, but you are never going to see a green or red cat. And a yellowish cat-like animal might be a lion, tiger or leopard. For natural images, color gives you extra information, and converting the images to grayscale may hurt performance.

Regarding color spaces: if one color space can be converted into another through equations, the CNN can in principle learn the conversion itself, so changing the color space should have little effect. The YUV color space, however, separates the luminance (Y component) from the color components (U and V). The luminance is less important for recognition, since it depends more on the light source than on the object's properties, while the U and V components are more relevant.
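
To illustrate the point about learnable conversions, here is a minimal sketch for the linear case only. It assumes the common BT.601-style RGB-to-YUV matrix (coefficients rounded), and the helper names are hypothetical. It shows that a linear filter applied to YUV pixels gives the same response as a filter with suitably adjusted weights applied directly to the RGB pixels, which is why a first convolutional layer could absorb such a conversion.

```python
# BT.601-style RGB -> YUV matrix (rounded coefficients; an assumption).
RGB_TO_YUV = (
    (0.299, 0.587, 0.114),        # Y: luminance
    (-0.14713, -0.28886, 0.436),  # U: blue-difference chroma
    (0.615, -0.51499, -0.10001),  # V: red-difference chroma
)


def rgb_to_yuv(pixel):
    """Apply the 3x3 matrix to one (r, g, b) pixel, like a 1x1 convolution."""
    return tuple(sum(w * c for w, c in zip(row, pixel)) for row in RGB_TO_YUV)


def response(weights, pixel):
    """Response of a single linear 1x1 filter to one pixel."""
    return sum(w * c for w, c in zip(weights, pixel))


def fold_into_rgb_weights(yuv_weights):
    """Return RGB-space weights equivalent to yuv_weights applied after conversion."""
    return tuple(
        sum(yuv_weights[k] * RGB_TO_YUV[k][j] for k in range(3))
        for j in range(3)
    )


pixel = (0.2, 0.5, 0.9)
yuv_filter = (0.3, -1.2, 0.7)
# Both responses agree up to floating-point rounding.
on_yuv = response(yuv_filter, rgb_to_yuv(pixel))
on_rgb = response(fold_into_rgb_weights(yuv_filter), pixel)
```

This argument covers only linear conversions such as RGB to YUV. A nonlinear conversion such as RGB to HSV cannot be folded into a single linear layer this way, so the network would need more than its first layer to approximate it.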

This book chapter (Link) might give you further insights.

Also check out:

"Effect of image colourspace on performance of convolution neural networks" by K Sumanth Reddy, Upasna Singh and Prakash K Uttam. Published in: 2017 2nd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT). (https://ieeexplore.ieee.org/document/8256949)

The authors investigated the effect of different color spaces (RGB, HSL, HSV, LUV, YUV) on the performance of an AlexNet CNN trained on the CIFAR10 dataset.

There is also a thesis on a related topic: Link

Upvotes: 7
