Reputation: 45931
I'm trying to migrate this code, 3.3 Omniglot Character set classification using Prototypical Network.ipynb, from TensorFlow 1.1 to TensorFlow 2.x.
My problem is that I don't fully understand what I'm doing. The code I'm having trouble with is:
import numpy as np
import tensorflow as tf

def convolution_block(inputs, out_channels, name='conv'):
    conv = tf.layers.conv2d(inputs, out_channels, kernel_size=3, padding='SAME')
    conv = tf.contrib.layers.batch_norm(conv, updates_collections=None, decay=0.99, scale=True, center=True)
    conv = tf.nn.relu(conv)
    conv = tf.contrib.layers.max_pool2d(conv, 2)
    return conv

def get_embeddings(support_set, h_dim, z_dim, reuse=False):
    net = convolution_block(support_set, h_dim)
    net = convolution_block(net, h_dim)
    net = convolution_block(net, h_dim)
    net = convolution_block(net, z_dim)
    net = tf.contrib.layers.flatten(net)
    return net
And I have migrated to:
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D

def get_embedding_function(img_shape):
    inputs = Input(img_shape)
    conv1 = Conv2D(64, (3, 3), activation='relu', padding='same', name='conv1_1')(inputs)
    pool1 = MaxPooling2D(pool_size=(2, 2), data_format='channels_last', name='pool1')(conv1)
    conv2 = Conv2D(96, (3, 3), activation='relu', padding='same', name='conv2_1')(pool1)
    pool2 = MaxPooling2D(pool_size=(2, 2), data_format='channels_last', name='pool2')(conv2)
    conv3 = Conv2D(128, (3, 3), activation='relu', padding='same', name='conv3_1')(pool2)
    pool3 = MaxPooling2D(pool_size=(2, 2), data_format='channels_last', name='pool3')(conv3)
    conv4 = Conv2D(256, (3, 3), activation='relu', padding='same', name='conv4_1')(pool3)
    pool4 = MaxPooling2D(pool_size=(2, 2), data_format='channels_last', name='pool4')(conv4)
    model = tf.keras.models.Model(inputs=inputs, outputs=pool4)
    model.compile(tf.keras.optimizers.Adam(lr=(1e-4) * 2), loss='binary_crossentropy', metrics=['accuracy'])
    return model
This function does not have the same layers as the previous one because I want to test my own network.
I'm going to use this function to extract features from images.
I had to add model = tf.keras.models.Model(inputs=inputs, outputs=pool4)
because if I only return pool4
it doesn't work. I have also added model.compile(tf.keras.optimizers.Adam(lr=(1e-4) * 2), loss='binary_crossentropy', metrics=['accuracy']),
but I don't know if I need it.
Do I need to create the model and compile it to extract features from an image?
Upvotes: 0
Views: 371
Reputation: 11651
There are mainly two ways of training a network using the Keras API: the fit method, and a custom training loop.
In both cases, when using the Keras API, you need to create a Model, which is a collection of connected layers.
Let's define a simple MLP (multi-layer perceptron) Model using Keras:
import tensorflow as tf
inp = tf.keras.Input((1,))
hidden = tf.keras.layers.Dense(10, activation="tanh")(inp)
out = tf.keras.layers.Dense(1, activation="sigmoid")(hidden)
model = tf.keras.Model(inputs=inp, outputs=out)
Let's also generate some simple synthetic data:
x = tf.random.normal((100,1))
y = 2*x + 1
Note that using an MLP for a regression as simple as this is overkill; a simple linear regression would be enough.
fit
If you want to use the fit method, then you need to compile the model.
Compiling the model is akin to providing its training strategy: which objective function (loss) to use, and with which optimization algorithm.
In this case, let's use a simple mean squared error as the loss, and SGD as the optimization algorithm. Once that's done, you can simply call fit on your data.
>>> model.compile(optimizer="sgd", loss="mse")
>>> model.fit(x,y)
4/4 [==============================] - 0s 2ms/step - loss: 4.5469
The fit method provides plenty of options; you can explore them by looking at the documentation.
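For instance, a minimal sketch of a few commonly used options (the values here are only illustrative):
model.fit(
    x, y,
    epochs=10,             # number of passes over the training data
    batch_size=32,         # number of samples per gradient update
    validation_split=0.2,  # hold out 20% of the data for validation
    verbose=1,             # show a progress bar
)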
Sometimes, using the fit method is not flexible enough. In that case, it is possible to train the model by writing a training loop from scratch, where you define everything yourself. If I want to use SGD as the optimizer and a mean squared error loss function, I can do it this way:
opt = tf.optimizers.SGD()
for data, label in zip(x, y):
    with tf.GradientTape() as tape:
        # add a batch dimension so the shape matches the model's Input((1,))
        pred = model(data[None])
        loss = tf.losses.mse(label[None], pred)
    grad = tape.gradient(loss, model.weights)
    opt.apply_gradients(zip(grad, model.weights))
This approach is more flexible, but also more verbose. In that case, I don't need to compile the model. Compiling only makes the optimizer and the loss function known to the fit method.
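In particular, for the feature-extraction use case in the question, compiling is not needed: calling the model directly on a batch of inputs just runs a forward pass and returns the outputs of the last layer. A minimal sketch with the toy MLP above (x is the synthetic data generated earlier):
features = model(x)      # forward pass only; no compile() or fit() required
print(features.shape)    # (100, 1)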
You can read more about fit in the Basic classification: Classify images of clothing tutorial.
Upvotes: 1
Reputation: 4313
Whether or not to compile the model depends on the method you use to train it, i.e.:
If you use model.fit, then you need to compile the model before fitting it.
If you use custom training, then you don't have to; just return the model and use it like:
optimizer = tf.keras.optimizers.Adam(lr=(1e-4) * 2)
bce = tf.keras.losses.BinaryCrossentropy()

with tf.GradientTape() as tape:
    y_pred = model(X)
    loss = bce(y_true, y_pred)
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))
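In practice you would run that training step in a loop over batches, for example with tf.data (here X and y_true are placeholders for your own images and labels, and model, bce and optimizer are defined as above):
dataset = tf.data.Dataset.from_tensor_slices((X, y_true)).batch(32)
for epoch in range(10):
    for x_batch, y_batch in dataset:
        with tf.GradientTape() as tape:
            y_pred = model(x_batch, training=True)
            loss = bce(y_batch, y_pred)
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))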
Upvotes: 1