Mattia_C

Reputation: 1

How to generate a tensor of desired features, extracted from RGB images using pre-trained models?

I would like to generate a 1D vector of features extracted from RGB images (256 x 256 x 3), using pre-trained models. Starting from a tensor of shape (N_images, 256, 256, 3), I would like to obtain a tensor of shape (N_images, M_features), where M_features is the number of features, chosen by the user. I found a feasible solution in the keras/tensorflow documentation (see: “Extract features with VGG16”) and I tried the following code (using ResNet50):

from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
N_images= img_data.shape[0]
model = ResNet50(weights='imagenet', include_top=False, input_shape=(256,256,3))
model.summary()
img_data = preprocess_input(img_data)
res_feature = model.predict(img_data)
res_feature.shape

However, the shape of the feature set is (N_images, 8, 8, 2048). Therefore, I added a GlobalAveragePooling2D layer:

from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
from tensorflow.keras import Model, Input, regularizers
N_images= img_data.shape[0]
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(256,256,3))
x = base_model.output
x = GlobalAveragePooling2D() (x)
model = Model(inputs=base_model.input, outputs=x)
img_data = preprocess_input(img_data)
res_feature = model.predict(img_data)
res_feature.shape

In this case, the shape of the output tensor is (N_images, 2048), which could be OK, but I would like to choose a specific number of desired features. Thanks in advance.

Upvotes: 0

Views: 160

Answers (1)

Alberto

Reputation: 12949

You probably want an autoencoder, which learns a lower-dimensional representation of its input (i.e. dimensionality reduction). It's simpler than it seems (given a desired dimension M):

  • create the "dataset":
import numpy as np
res_feature = model.predict(img_data)
# flatten each sample into a 1D feature vector
res_feature = np.reshape(res_feature, (len(res_feature), -1))
  • create the autoencoder:
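import tensorflow as tf
inputs = tf.keras.layers.Input(shape=res_feature.shape[1:])
# bottleneck: compresses each sample to M features
encoder = tf.keras.layers.Dense(M, activation="selu")(inputs)
# reconstructs the original feature vector from the M features
decoder = tf.keras.layers.Dense(res_feature.shape[1], activation="linear")(encoder)

# named autoencoder so it does not overwrite the feature-extractor model
autoencoder = tf.keras.models.Model(inputs=inputs, outputs=decoder)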
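  • compile and train it with your favorite optimizer and a reconstruction loss (e.g. mean squared error), using res_feature as both input and target; the encoder output then gives the M features per image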
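The steps above could be put together roughly as follows. This is only a minimal sketch under some assumptions: random data stands in for the real res_feature, and M = 32, the "adam" optimizer, the "mse" loss, the epoch count and the batch size are illustrative choices, not values from the question. A second model sharing the same layers is used to read out the bottleneck.

```python
import numpy as np
import tensorflow as tf

# Stand-in for res_feature = model.predict(img_data): random values, illustration only.
rng = np.random.default_rng(0)
res_feature = rng.normal(size=(64, 2048)).astype("float32")

M = 32  # desired number of features (assumed value)

inputs = tf.keras.layers.Input(shape=res_feature.shape[1:])
encoded = tf.keras.layers.Dense(M, activation="selu")(inputs)
decoded = tf.keras.layers.Dense(res_feature.shape[1], activation="linear")(encoded)

# Full autoencoder (trained) and encoder-only model (shares the same weights).
autoencoder = tf.keras.models.Model(inputs=inputs, outputs=decoded)
encoder = tf.keras.models.Model(inputs=inputs, outputs=encoded)

# Reconstruction objective: the input is also the target.
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(res_feature, res_feature, epochs=2, batch_size=16, verbose=0)

# After training, the encoder maps (N_images, 2048) to (N_images, M).
reduced = encoder.predict(res_feature, verbose=0)
print(reduced.shape)  # (64, 32)
```

With real features you would likely train for more epochs and check the reconstruction loss on held-out images before relying on the M features.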

Upvotes: 1
