DQ_happy

Reputation: 515

Are image pixel values normalized by tf.image.decode_jpeg and tf.train.shuffle_batch?

I am trying to use the tf.train.shuffle_batch function from TensorFlow, so I first need to load the images with tf.image.decode_jpeg (or a similar function for PNG and JPG files). I just found out that the images are loaded like a probability map: the maximum pixel value is 1 and the minimum is 0. Below is my code, adapted from a GitHub repo. I don't know why the pixel values are normalized to [0,1], and I can't find any related documentation for TensorFlow. Could anyone help me? Thanks.

def load_examples(self, input_dir, flip, scale_size, batch_size, min_queue_examples):
    input_paths = get_image_paths(input_dir)
    with tf.name_scope("load_images"):
        path_queue = tf.train.string_input_producer(input_paths)
        reader = tf.WholeFileReader()
        paths, contents = reader.read(path_queue)
        # note this is important for truncated images
        raw_input = tf.image.decode_jpeg(contents, try_recover_truncated=True, acceptable_fraction=0.5)
        raw_input = tf.image.convert_image_dtype(raw_input, dtype=tf.float32)
        raw_input.set_shape([None, None, 3])

        # break apart image pair and move to range [-1, 1]
        width = tf.shape(raw_input)[1]  # [height, width, channels]
        a_images = preprocess(raw_input[:, :width // 2, :])
        b_images = raw_input[:, width // 2:, :]

    inputs, targets = [a_images, b_images]

    def transform(image):
        r = image

        r = tf.image.resize_images(r, [self.image_height, self.image_width], method=tf.image.ResizeMethod.AREA)
        return r
    def transform_gaze(image):
        r = image
        r = tf.image.resize_images(r, [self.gaze_height, self.gaze_width], method=tf.image.ResizeMethod.AREA)
        return r
    with tf.name_scope("input_images"):
        input_images = transform(inputs)

    with tf.name_scope("target_images"):
        target_images = transform(targets)
    total_image_count = len(input_paths)
    # target_images = tf.image.per_image_standardization(target_images)
    target_images = target_images[:,:,0]
    target_images = tf.expand_dims(target_images, 2)
    inputs_batch, targets_batch = tf.train.shuffle_batch([input_images, target_images],
                                         batch_size=batch_size,
                                         num_threads=1,
                                         capacity=min_queue_examples + 3 * batch_size,
                                         min_after_dequeue=min_queue_examples)
    # inputs_batch, targets_batch = tf.train.batch([input_images, target_images],batch_size=batch_size)
    return inputs_batch, targets_batch, total_image_count

Upvotes: 1

Views: 2646

Answers (1)

nessuno

Reputation: 27050

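The values are in [0,1] because of the tf.image.convert_image_dtype call in your code, not because of the decoding itself: tf.image.decode_jpeg returns a uint8 tensor with values in [0,255].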

In general, the tf.image functions follow this convention: when a method returns a float tensor, its values are supposed to be in the [0,1] range, whilst if the returned tensor is a uint8 the values are supposed to be in the [0,255] range.

Also, when you use the tf.image.convert_image_dtype method to convert the dtype of the input image, you're applying these conversion rules.

If your input image is a uint8 image and you convert it to float32, the values are scaled into the [0,1] range. If your image is already a float, its values are assumed to be in that range already and no rescaling is done.
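
To make the scaling concrete, here is a minimal sketch in plain Python (no TensorFlow needed) of the arithmetic behind a uint8 to float conversion. The helper names `uint8_to_unit_float` and `unit_float_to_255` are made up for illustration and are not part of any TensorFlow API; they only mimic the convention described above.

```python
def uint8_to_unit_float(pixels):
    """Map integer pixel values in [0, 255] to floats in [0, 1]."""
    # Hypothetical helper: divides by the largest uint8 value.
    return [p / 255.0 for p in pixels]


def unit_float_to_255(pixels):
    """Map floats in [0, 1] back onto the [0, 255] scale (results stay floats)."""
    # Hypothetical helper: the inverse of the scaling above.
    return [p * 255.0 for p in pixels]


raw = [0, 51, 255]                    # values a decoded uint8 image might hold
scaled = uint8_to_unit_float(raw)     # values after conversion to a float image
restored = unit_float_to_255(scaled)  # multiplying by 255 undoes the scaling

print(scaled)
print(restored)
```

If you want float pixels that stay in [0,255], tf.cast(raw_input, tf.float32) changes only the dtype and does not rescale, so it could replace the convert_image_dtype call; alternatively, multiply the converted image by 255.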

Upvotes: 4
