Reputation: 21
I've been playing with different models from TF Hub to extract feature vectors:
import tensorflow_hub as hub

module = hub.load('https://tfhub.dev/google/tf2-preview/inception_v3/feature_vector/4')
features = module(image)
What I don't quite understand is how the input image should be preprocessed. Every model from the hub has this generic instruction:
The input images are expected to have color values in the range [0,1], following the common image input conventions. The expected size of the input images is height x width = 299 x 299 pixels by default, but other input sizes are possible (within limits).
where "common image input conventions" is a link to the following:
A signature that takes a batch of images as input accepts them as a dense 4-D tensor of dtype float32 and shape [batch_size, height, width, 3] whose elements are RGB color values of pixels normalized to the range [0, 1]. This is what you get from tf.image.decode_*() followed by tf.image.convert_image_dtype(..., tf.float32).
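Taken literally, that convention corresponds to a pipeline roughly like this (a sketch; path is a placeholder and 299 is just this particular model's default input size):
import tensorflow as tf

image = tf.io.read_file(path)
# Decode to an H x W x 3 uint8 tensor with values in [0, 255]
image = tf.io.decode_jpeg(image, channels=3)
# Convert uint8 -> float32 and rescale the values to [0, 1]
image = tf.image.convert_image_dtype(image, tf.float32)
# Resize and add a batch dimension: 1 x 299 x 299 x 3
image = tf.image.resize(image, [299, 299])[tf.newaxis, ...]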
and something like this is indeed what I see quite often online:
image = tf.io.read_file(path)
# Decode the image to an H x W x 3 uint8 tensor
image = tf.io.decode_jpeg(image, channels=3)
# Resize the image to the model's input size
image = tf.image.resize(image, [model_input_size, model_input_size])
# Add a batch dimension: 1 x model_input_size x model_input_size x 3, dtype float32
image = tf.image.convert_image_dtype(image, tf.float32)[tf.newaxis, ...]
BUT the color values are expected to be in the range [0,1], and with the snippet above they are still in [0,255]: tf.image.resize already returns a float32 tensor, so the subsequent tf.image.convert_image_dtype(..., tf.float32) no longer rescales anything. The values have to be scaled down explicitly:
image = numpy.array(image) * (1. / 255)
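or, staying inside TensorFlow (the tensor coming out of tf.image.resize is already float32, so a plain division works):
image = image / 255.0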
Is this just a common mistake, or is the TF documentation not up to date?
I was also playing with models from tf.keras.applications and reading the source code on GitHub. I noticed that in some of the models (e.g. EfficientNet) the first layer is:
x = layers.Rescaling(1. / 255.)(x)
but in other models there is no such layer; instead a utility function rescales the inputs, for example tf.keras.applications.mobilenet.preprocess_input (which for MobileNet maps [0, 255] to [-1, 1]).
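For example, a minimal sketch of the preprocess_input route (the random batch is just a stand-in for real image data):
import numpy as np
import tensorflow as tf

batch = np.random.uniform(0, 255, size=(1, 224, 224, 3)).astype("float32")  # stand-in batch, values in [0, 255]
# For MobileNet, preprocess_input maps [0, 255] to [-1, 1]; other applications use different ranges
batch = tf.keras.applications.mobilenet.preprocess_input(batch)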
So, how important is it for TF Hub SavedModels that the image colors are in the [0,1] range?
Upvotes: 2
Views: 472
Reputation: 603
This is just a convention TF Hub proposes: "Models for the same task are encouraged to implement a common API so that model consumers can easily exchange them without modifying the code that uses them, even if they come from different publishers" (from here).
As you've noted, the publisher of google/tf2-preview/inception_v3/feature_vector/4 decided that input images "are expected to have color values in the range [0,1]", while the publisher of tensorflow/efficientdet/d1/1 decided to add a Rescaling
layer to the model itself such that "[a tensor] with values in [0, 255]" can be passed. So ultimately, it's up to the publisher how they implement their model. In any case, when using models from tfhub.dev, the expected preprocessing steps will always be documented on the model page.
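To make the contrast concrete, here is a rough sketch of the two conventions (the image path is a placeholder; the exact signatures are documented on each model page):
import tensorflow as tf
import tensorflow_hub as hub

img = tf.io.decode_jpeg(tf.io.read_file('example.jpg'), channels=3)  # uint8, values in [0, 255]

# google/tf2-preview/inception_v3/feature_vector/4 expects float32 values in [0, 1], 299 x 299 by default
inception = hub.load('https://tfhub.dev/google/tf2-preview/inception_v3/feature_vector/4')
scaled = tf.image.resize(tf.image.convert_image_dtype(img, tf.float32), [299, 299])[tf.newaxis, ...]
features = inception(scaled)

# tensorflow/efficientdet/d1/1 rescales internally and takes a uint8 batch with values in [0, 255]
detector = hub.load('https://tfhub.dev/tensorflow/efficientdet/d1/1')
detections = detector(img[tf.newaxis, ...])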
Upvotes: 1