Burger

Reputation: 413

Vanishing gradient and very low accuracy in InceptionV3

I'm working on a multi-class classification task with InceptionV3 in TensorFlow. I thought I had set everything up correctly, but the results are very strange and nowhere close to what I expected.

Epoch 1/10
58/58 [==============================] - 47s 591ms/step - loss: 0.0000e+00 - accuracy: 0.0279
Epoch 2/10
58/58 [==============================] - 38s 591ms/step - loss: 0.0000e+00 - accuracy: 0.0286
Epoch 3/10
58/58 [==============================] - 38s 596ms/step - loss: 0.0000e+00 - accuracy: 0.0249
Epoch 4/10
58/58 [==============================] - 38s 603ms/step - loss: 0.0000e+00 - accuracy: 0.0250

Here is my code.

This chunk of code is where I deal with the data. raw_train comes from oxford_iiit_pet.

import tensorflow as tf
import tensorflow_datasets as tfds

dataset, metadata = tfds.load('oxford_iiit_pet', with_info=True, as_supervised=True)
raw_train, raw_test = dataset['train'], dataset['test']
IMAGE_SIZE = (224, 224)


def preprocess_dataset(image, label):
  image = tf.cast(image, tf.float32)
  image = (image/127.5) - 1  # scale pixels to [-1, 1]; might need to fix this part
  image = tf.image.resize(image, IMAGE_SIZE)
  return image, label

BATCH_SIZE = 64
SHUFFLE_BUFFER_SIZE = 1024

train = raw_train.map(preprocess_dataset)
test = raw_test.map(preprocess_dataset)

train_batches = train.shuffle(SHUFFLE_BUFFER_SIZE).batch(BATCH_SIZE)
test_batches = test.batch(BATCH_SIZE)

This is my model.

IMG_SHAPE = (IMAGE_SIZE[0], IMAGE_SIZE[1], 3)
model_inception = tf.keras.applications.InceptionV3(input_shape=IMG_SHAPE, include_top = False, weights = 'imagenet')

model_inception.trainable = False # freeze the model

learning_rate = 0.001

model = tf.keras.Sequential([
    model_inception,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(
    optimizer = tf.keras.optimizers.RMSprop(lr=learning_rate),
    loss = 'categorical_crossentropy',
    metrics=['accuracy']
)

EPOCHS = 10

history = model.fit(
    train_batches,
    epochs = EPOCHS
)

I'm not entirely sure whether it's the way I preprocessed the data or the way I set up the model. Everything seems fine until I actually run the model. It looks like a vanishing gradient problem, but I don't know why it would happen or whether that's actually the issue. I've looked at examples of how the model is used, but nothing gives a clear answer.

Upvotes: 0

Views: 141

Answers (1)

mmiron

Reputation: 164

So, you're outputting a single value in the range [0.0, 1.0]. But you say there are multiple classes, not just one: you'll want the number of neurons in your last layer to equal the number of classes, and change the activation from sigmoid to softmax. Like this:

NUM_CLASSES = 8   # edit this number to suit your problem

model = tf.keras.Sequential([
    model_inception,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(NUM_CLASSES, activation='softmax')
])
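
Since you loaded the dataset with with_info=True, you don't have to hard-code that number; as a small sketch (assuming the metadata object from your loading code is still in scope), you can read the class count straight from the tfds info:

NUM_CLASSES = metadata.features['label'].num_classes  # 37 breeds for oxford_iiit_pet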

A single sigmoid output won't work properly here, even though it will look like it is working (its values are floats between 0 and 1, just like the ones softmax would give you).

Frankly, I'm not even sure how you aren't getting some sort of error when calling fit(). That's one of the problems with Keras, imho: it's so user-friendly that it will "work" (which is to say, run, not actually work) even when it should be giving you a hint that you've set up your data improperly. Unless the training targets in your dataset consist of a single float value for every input image, it shouldn't even be running.
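
One more thing worth checking, and this is a sketch that goes slightly beyond the output layer: the labels tfds gives you here are integer class indices, not one-hot vectors, so categorical_crossentropy won't line up with a softmax over NUM_CLASSES outputs. Assuming you keep the integer labels as-is, the compile step would look something like this:

model.compile(
    optimizer=tf.keras.optimizers.RMSprop(learning_rate=learning_rate),
    # integer labels (0..NUM_CLASSES-1) pair with the sparse variant;
    # keep 'categorical_crossentropy' only if you one-hot encode the labels first
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)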

Upvotes: 1
