Understanding load_image() method in pycaffe

Question

The docstring in the source reads:

Load an image converting from grayscale or alpha as needed.

Parameters
----------
filename : string
color : boolean
    flag for color format. True (default) loads as RGB while False
    loads as intensity (if image is already grayscale).

Returns
-------
image : an image with type np.float32 in range [0, 1]
    of size (H x W x 3) in RGB or
    of size (H x W x 1) in grayscale.

And this is an example of how it is used:

input_image = 255 * caffe.io.load_image(IMAGE_FILE)

My question: if IMAGE_FILE is an RGB image with each channel in the range 0-255, and caffe.io.load_image(IMAGE_FILE) returns values in the range [0, 1], then after multiplying by 255 each channel is back in the range 0-255.

So what is the point of this step?

Shai · Accepted Answer

The reasons for reading an image to float type in range [0..1] are:

Some models do not scale the input back to [0..255], but rather process the input in the range [0..1].
It is quite common when processing images to scale the pixels' values to [0..1] when converting the image data type from uint to floating point (see, e.g., Matlab's im2double, im2single).
Some image formats have data in range [0..65535] (2 bytes/pixel); in such cases it is convenient to keep the range fixed and only play with the scale.
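
To illustrate the points above, here is a minimal sketch using NumPy only, not pycaffe. The helper name to_unit_float is hypothetical, and the tiny arrays stand in for real image data. It shows that 8-bit and 16-bit inputs both land in [0, 1] after normalization, so a single multiplication by 255 brings either one to the scale a model trained on raw pixel values would expect. Note that the result of that multiplication is still float32, not uint8.

```python
import numpy as np

def to_unit_float(img):
    """Scale an integer image to float32 in [0, 1] using its dtype's max value.

    Hypothetical helper for illustration; this is not pycaffe's code.
    """
    max_val = np.iinfo(img.dtype).max  # 255 for uint8, 65535 for uint16
    return img.astype(np.float32) / max_val

# Stand-ins for an 8-bit and a 16-bit image
img8 = np.array([[0, 128, 255]], dtype=np.uint8)
img16 = np.array([[0, 32768, 65535]], dtype=np.uint16)

unit8 = to_unit_float(img8)    # float32 values in [0, 1]
unit16 = to_unit_float(img16)  # float32 values in [0, 1]

# The same rescaling step works regardless of the original bit depth
scaled8 = 255 * unit8
scaled16 = 255 * unit16

print(unit8.dtype, scaled8.dtype)
print(scaled8, scaled16)
```

Because the range is fixed at [0, 1] after loading, the code that follows does not need to know the bit depth of the file; it only chooses the scale it wants.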
