Reputation: 255
I trained a Keras model with the following architecture:
def make_model(input_shape, num_classes):
    inputs = keras.Input(shape=input_shape)
    # Image augmentation block
    x = inputs

    # Entry block
    x = layers.experimental.preprocessing.Rescaling(1.0 / 255)(x)
    x = layers.Conv2D(32, 3, strides=2, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)

    x = layers.Conv2D(64, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)

    previous_block_activation = x  # Set aside residual

    for size in [128, 256, 512, 728]:
        x = layers.Activation("relu")(x)
        x = layers.SeparableConv2D(size, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)

        x = layers.Activation("relu")(x)
        x = layers.SeparableConv2D(size, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)

        x = layers.MaxPooling2D(3, strides=2, padding="same")(x)

        # Project residual
        residual = layers.Conv2D(size, 1, strides=2, padding="same")(
            previous_block_activation
        )
        x = layers.add([x, residual])  # Add back residual
        previous_block_activation = x  # Set aside next residual

    x = layers.SeparableConv2D(1024, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)

    x = layers.GlobalAveragePooling2D()(x)
    if num_classes == 2:
        activation = "sigmoid"
        units = 1
    else:
        activation = "softmax"
        units = num_classes

    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(units, activation=activation)(x)
    return keras.Model(inputs, outputs)
And that model has over 2 million trainable parameters.
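(As a quick way to double-check such counts, a minimal sketch assuming the standard Keras Model API; the input shape matches the 180x180 RGB images used further below:)

model = make_model(input_shape=(180, 180, 3), num_classes=2)
model.summary()              # per-layer parameter breakdown
print(model.count_params())  # total parameter count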
I then trained a much lighter model with only about 300,000 trainable parameters:
def make_model(input_shape, num_classes):
    inputs = keras.Input(shape=input_shape)
    # Image augmentation block
    x = inputs

    # Entry block
    x = layers.experimental.preprocessing.Rescaling(1.0 / 255)(x)
    # Note: the input_shape kwargs on the Conv2D layers were redundant
    # (the functional API takes the shape from keras.Input) and are dropped here.
    x = layers.Conv2D(64, kernel_size=(7, 7), padding="same",
                      activation=tf.keras.layers.LeakyReLU(alpha=0.01))(x)
    x = layers.MaxPooling2D(pool_size=(2, 2))(x)
    x = layers.Conv2D(192, kernel_size=(3, 3), padding="same",
                      activation=tf.keras.layers.LeakyReLU(alpha=0.01))(x)
    x = layers.Conv2D(128, kernel_size=(1, 1), padding="same",
                      activation=tf.keras.layers.LeakyReLU(alpha=0.01))(x)
    x = layers.MaxPooling2D(pool_size=(2, 2))(x)
    x = layers.Conv2D(128, kernel_size=(3, 3), padding="same",
                      activation=tf.keras.layers.LeakyReLU(alpha=0.01))(x)
    x = layers.MaxPooling2D(pool_size=(2, 2))(x)
    x = layers.Dropout(0.5)(x)
    x = layers.GlobalAveragePooling2D()(x)
    if num_classes == 2:
        activation = "sigmoid"
        units = 1
    else:
        activation = "softmax"
        units = num_classes

    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(units, activation=activation)(x)
    return keras.Model(inputs, outputs)
However, the last model (which is much lighter and even accepts a smaller input size) seems to run at the same speed, only classifying 2 images per second. Shouldn't there be a difference in speed, given that it's a smaller model? Looking at the code, is there a glaring reason why that wouldn't be the case?
I'm using the same method at inference in both cases:
import time

import tensorflow as tf
from tensorflow import keras

image_size = (180, 180)
batch_size = 32
model = keras.models.load_model('model_13.h5')

t_end = time.time() + 10
iters = 0
while time.time() < t_end:
    img = keras.preprocessing.image.load_img(
        "test2.jpg", target_size=image_size
    )
    img_array = keras.preprocessing.image.img_to_array(img)  # was: image.img_to_array
    # print(img_array.shape)
    img_array = tf.expand_dims(img_array, 0)  # Create batch axis

    predictions = model.predict(img_array)
    score = float(predictions[0])  # sigmoid output in [0, 1]
    print(score)
    iters += 1
    if score < 0.5:
        print('Fire')
    else:
        print('No Fire')
print('TOTAL: ', iters)
Upvotes: 3
Views: 418
Reputation: 524
The number of parameters is at most an indication of how fast a model trains or runs inference; throughput depends on many other factors.
Here are some examples of factors that can influence the throughput of a model beyond its parameter count:
- how well the individual ops parallelize on your hardware (a few wide layers can run faster than many narrow ones with the same parameter count),
- the depth of the model, since layers execute sequentially,
- memory bandwidth and data transfer between CPU and GPU,
- the fixed per-call overhead of the framework and of your input pipeline, including batch size, as the sketch below illustrates.
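For instance, here is a rough sketch of how batch size alone changes throughput (the shapes, the count of 32, and the random dummy data are illustrative assumptions, not your exact setup):

import time

import numpy as np

batch = np.random.rand(32, 180, 180, 3).astype("float32")  # dummy image batch

t0 = time.time()
for i in range(32):
    model.predict(batch[i : i + 1])  # 32 separate batch-size-1 calls
t1 = time.time()
model.predict(batch)  # one call on the whole batch of 32
t2 = time.time()
print(f"one-by-one: {t1 - t0:.2f}s  batched: {t2 - t1:.2f}s")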
It's hard to tell without testing, but in your particular example I would guess that the inference loop, not the model, is the bottleneck: you reload and decode the JPEG from disk on every iteration and call model.predict on a single image (batch size 1), so the fixed per-image overhead dominates and hides the difference in model size.
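To check that guess, you could time the model in isolation, something like this (a sketch that reuses model and image_size from your snippet; n_runs is an arbitrary name of mine):

import time

import tensorflow as tf
from tensorflow import keras

# Load and preprocess the image once, outside the timed loop.
img = keras.preprocessing.image.load_img("test2.jpg", target_size=image_size)
img_array = tf.expand_dims(keras.preprocessing.image.img_to_array(img), 0)

n_runs = 100
start = time.time()
for _ in range(n_runs):
    _ = model(img_array, training=False)  # direct call avoids predict() overhead
print("seconds per image:", (time.time() - start) / n_runs)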
To understand what's going on, you could either change some parameters and measure the speed again, or analyze your input pipeline by tracing your hardware with TensorBoard. Here is a small guide: https://www.tensorflow.org/tensorboard/tensorboard_profiling_keras
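For inference outside of a training callback, the profiler can also be started manually, roughly like this (assumes TensorFlow >= 2.2; the log directory name is arbitrary):

import tensorflow as tf

tf.profiler.experimental.start("logs/infer")
for _ in range(50):
    model.predict(img_array)
tf.profiler.experimental.stop()
# Then inspect the trace in the Profile tab: tensorboard --logdir logs/infer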
Best, Sascha
Upvotes: 2