I'm working on multi-class classification with InceptionV3 in TensorFlow. I thought I had everything right, but the results are very strange and nowhere near what I expected.
Epoch 1/10
58/58 [==============================] - 47s 591ms/step - loss: 0.0000e+00 - accuracy: 0.0279
Epoch 2/10
58/58 [==============================] - 38s 591ms/step - loss: 0.0000e+00 - accuracy: 0.0286
Epoch 3/10
58/58 [==============================] - 38s 596ms/step - loss: 0.0000e+00 - accuracy: 0.0249
Epoch 4/10
58/58 [==============================] - 38s 603ms/step - loss: 0.0000e+00 - accuracy: 0.0250
Here is my code.
This chunk of code is where I deal with the data. raw_train comes from oxford_iiit_pet.
dataset, metadata = tfds.load('oxford_iiit_pet', with_info=True, as_supervised=True)
raw_train, raw_test = dataset['train'], dataset['test']
IMAGE_SIZE = (224, 224)
def preprocess_dataset(image, label):
    image = tf.cast(image, tf.float32)
    image = (image/127.5)-1 # might need to fix this part
    image = tf.image.resize(image, IMAGE_SIZE)
    return image, label
BATCH_SIZE = 64
SHUFFLE_BUFFER_SIZE = 1024
train = raw_train.map(preprocess_dataset)
test = raw_test.map(preprocess_dataset)
train_batches = train.shuffle(SHUFFLE_BUFFER_SIZE).batch(BATCH_SIZE)
test_batches = test.batch(BATCH_SIZE)
This is my model.
IMG_SHAPE = (IMAGE_SIZE[0], IMAGE_SIZE[1], 3)
model_inception = tf.keras.applications.InceptionV3(input_shape=IMG_SHAPE, include_top = False, weights = 'imagenet')
model_inception.trainable = False # freeze the model
learning_rate = 0.001
model = tf.keras.Sequential([
    model_inception,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(
    optimizer = tf.keras.optimizers.RMSprop(lr=learning_rate),
    loss = 'categorical_crossentropy',
    metrics=['accuracy']
)
EPOCHS = 10
history = model.fit(
    train_batches,
    epochs = EPOCHS
)
I'm not sure whether the problem is in how I preprocess the data or in how I set up the model. Everything seems fine until I actually train the model. It looks like a vanishing-gradient problem, but I don't know why it would happen or whether that's really the issue. I've looked at how others use the model, but nothing gives a clear answer.
Reputation: 164
So, you're outputting a single value in the range [0.0, 1.0]. But you say there are multiple classes, not just one, so the last layer needs as many neurons as there are classes, and its activation should be softmax rather than sigmoid. Like this:
NUM_CLASSES = 8 # edit this to match your dataset (oxford_iiit_pet has 37 breed classes)
model = tf.keras.Sequential([
    model_inception,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(NUM_CLASSES, activation='softmax')
])
sigmoid won't work properly here, even though it may look like it does (its outputs are floats between 0 and 1, just like the ones softmax gives you).
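To make that difference concrete, here is a small pure-Python sketch (no TensorFlow needed). The logits are made-up numbers, not anything from your model. Applying sigmoid to each output independently gives values in (0, 1) that don't sum to 1, while softmax gives a proper distribution over the classes:

```python
import math

def sigmoid(x):
    # Squashes one value into (0, 1), independently of any other output.
    return 1.0 / (1.0 + math.exp(-x))

def softmax(logits):
    # Subtract the max for numerical stability, then normalise so the
    # outputs sum to 1 across all classes.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical raw outputs for 3 classes

sigmoid_out = [sigmoid(z) for z in logits]
softmax_out = softmax(logits)

print("sigmoid:", sigmoid_out, "sum =", sum(sigmoid_out))
print("softmax:", softmax_out, "sum =", sum(softmax_out))
```

Both lists contain floats between 0 and 1, which is why the sigmoid version can look plausible, but only the softmax outputs can be read as "probability of each class".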
Frankly, I'm not even sure how you aren't getting some sort of error when calling fit(). That's one of the problems with Keras, IMHO: it's so user-friendly that it will "work" (that is, run without actually working) even when it should be hinting that your data is set up improperly. Unless each training target is a single float value per input image, it should not even be running.
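On the target format: as far as I know, `categorical_crossentropy` expects one-hot targets, while `sparse_categorical_crossentropy` takes integer class IDs. I'm assuming here that `tfds.load(..., as_supervised=True)` gives you integer labels for this dataset, so check that on your side. A rough pure-Python sketch of how the two relate, with made-up probabilities and hypothetical helper names:

```python
import math

def one_hot(label, num_classes):
    # Turn an integer class ID into a one-hot vector.
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def categorical_ce(one_hot_target, probs):
    # Cross-entropy against a one-hot target vector.
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probs))

def sparse_categorical_ce(label, probs):
    # Same quantity, computed directly from the integer class ID.
    return -math.log(probs[label])

probs = [0.7, 0.2, 0.1]  # hypothetical softmax output for 3 classes
label = 1                # integer class ID (assumed label format)

loss_dense = categorical_ce(one_hot(label, len(probs)), probs)
loss_sparse = sparse_categorical_ce(label, probs)

print(loss_dense, loss_sparse)
```

The two values agree, so the choice is only about which target format you feed in. In Keras terms that would mean either `loss='sparse_categorical_crossentropy'` with the integer labels as they are, or one-hot encoding the labels (for example with `tf.one_hot`) and keeping `categorical_crossentropy`.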