What RGB format do tensorflow image ops expect?

Question

How does a tensorflow image op (like nn.conv2d) expect image channels to be represented?

an array of 3 values ranging from [0-255]
an array of 3 values ranging from [0-1]
an array of 3 one-hot arrays of size 255
something else?

I'm trying to understand why my learning rate is so poor and I'm guessing it's because my input is malformed.

Vijay Mariappan · Accepted Answer

conv2d accepts all the forms you mentioned here. It doesn't care what the input range is, as long as the values are within the data type's range. But from a neural network training perspective, it's very important that the inputs are scaled properly. This applies not only to the input image: at each layer, too, we want the inputs to be scaled properly. That is why techniques like batch normalization are present in almost all recent networks: they improve training by enabling better flow of gradients through the network. So scaling the images to the [-1, +1] range (or to zero mean and unit variance) is important.
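As a rough illustration of the two scaling options mentioned in the answer, here is a minimal NumPy sketch. The function names `scale_to_unit_range` and `standardize`, the `eps` guard, and the random 32x32 test image are assumptions made for this example, not part of TensorFlow. It assumes the image starts out as `uint8` values in [0, 255].

```python
import numpy as np


def scale_to_unit_range(image_uint8):
    """Map uint8 pixel values in [0, 255] to float32 values in [-1, 1]."""
    image = image_uint8.astype(np.float32)
    return image / 127.5 - 1.0


def standardize(image_uint8, eps=1e-8):
    """Shift and scale one image to zero mean and unit variance."""
    image = image_uint8.astype(np.float32)
    # eps (an assumption of this sketch) avoids division by zero
    # for a constant image.
    return (image - image.mean()) / (image.std() + eps)


# Hypothetical input: a random 32x32 RGB image in height x width x channel layout.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

scaled = scale_to_unit_range(image)   # values in [-1, 1]
standardized = standardize(image)     # mean about 0, std about 1
```

Either result can be given a leading batch dimension and passed to a convolution layer as a float tensor. Inside a TensorFlow pipeline, `tf.image.per_image_standardization` should give a similar per-image result to `standardize` here.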
