Reputation: 43
I'm attempting to train a network on the Places2 dataset and have arranged all the classes into subfolders. When the training and validation datasets are loaded via:
from tensorflow.keras.preprocessing import image_dataset_from_directory

train_ds = image_dataset_from_directory(
    "S:/Places2",
    image_size=(224, 224),
    batch_size=128,
)
validation_ds = image_dataset_from_directory(
    "S:/Places2Val",
    image_size=(224, 224),
    batch_size=128,
)
the console reports that all the images have been found in the correct number of classes:
Found 1803460 files belonging to 365 classes.
Found 36501 files belonging to 365 classes.
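(For context on how those labels are produced: image_dataset_from_directory infers one class per subfolder and assigns integer labels by sorting the folder names alphanumerically. A minimal stdlib-only sketch of the assumed layout, with hypothetical class names standing in for the real Places2 folders:)

from pathlib import Path
import tempfile

# Build a toy directory tree in the layout image_dataset_from_directory
# expects: one subfolder per class, image files inside each subfolder.
root = Path(tempfile.mkdtemp())
for class_name in ["beach", "airport", "canyon"]:
    (root / class_name).mkdir()
    (root / class_name / "img_0.jpg").touch()  # placeholder file

# Keras derives integer labels from the sorted class folder names,
# so "airport" -> 0, "beach" -> 1, "canyon" -> 2.
class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
print(class_names)  # ['airport', 'beach', 'canyon']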
However, when trying to train the following network:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import mixed_precision
from tensorflow.keras.applications import EfficientNetB2
start = keras.Input((224, 224, 3))
# main network
base = EfficientNetB2(
    include_top=False,
    weights="imagenet",
    input_shape=(224, 224, 3)
)
base.trainable = False
base = base(start)
# model head
top = keras.layers.AveragePooling2D()(base)
top = keras.layers.Flatten()(top)
top = keras.layers.Dense(128, activation="relu")(top)
top = keras.layers.Dropout(0.5)(top)
top = keras.layers.Dense(128, activation="relu")(top)
top = keras.layers.Dropout(0.5)(top)
top = keras.layers.Dense(365, activation="softmax")(top)
# build model and print summary
model = keras.Model(inputs=start, outputs=top)
model.summary()
# optimiser
opt = keras.optimizers.SGD(learning_rate=0.1, momentum=0.9, nesterov=True)
# Assemble model with appropriate loss function
model.compile(loss="sparse_categorical_crossentropy", optimizer=opt, metrics=['Accuracy'])
# Train and save model
model.fit(
    train_ds,
    validation_data=validation_ds,
    epochs=1,
    batch_size=128,
    verbose=1
)
model.save("places.tf")
It throws the incompatible shapes error:
ValueError: Shapes (None, 365) and (None, 1) are incompatible
This is despite image_dataset_from_directory returning inferred labels with integer encoding by default (label_mode="int"), which is exactly the format sparse categorical crossentropy expects. If the model has the correct number of outputs, and the loader finds the correct number of image categories, why is one of the shapes incorrect?
Particularly confusing is that changing the loss to categorical_crossentropy rearranges the error to:
ValueError: Shapes (None, 1) and (None, 365) are incompatible
Printing the labels of the first batch with
for images, labels in train_ds.take(1):
    print(labels)
shows that the labels are formatted as expected: a length-128 tensor of integer labels, which should be compatible with sparse categorical crossentropy.
tf.Tensor(
[ 17 226 130 186 177 34 342 33 277 284 333 358 245 263 33 72 50 139
298 331 250 241 50 48 264 276 218 236 303 355 3 185 107 329 277 299
10 314 62 141 221 200 9 64 227 288 253 234 77 174 358 69 277 345
361 205 8 197 194 217 114 135 296 305 278 82 355 134 300 129 76 321
167 296 90 299 291 344 29 291 202 333 168 257 354 79 142 77 280 5
261 234 78 90 250 245 302 189 97 194 347 272 54 256 160 55 131 206
284 51 347 163 313 354 263 63 190 150 220 22 102 33 8 35 97 13
16 277], shape=(128,), dtype=int32)
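(To double-check that this label format really does suit the loss, here is a pure-Python sketch, no TensorFlow and toy numbers only, of what sparse categorical crossentropy computes: it takes one integer label per sample and uses it to index into that sample's softmax vector.)

import math

def sparse_categorical_crossentropy(y_true, y_pred):
    """Per-sample loss: -log(probability assigned to the true class index).

    y_true: list of integer class indices, shape (batch,)
    y_pred: list of probability vectors, shape (batch, num_classes)
    """
    return [-math.log(probs[label]) for label, probs in zip(y_true, y_pred)]

# Integer labels, as produced by image_dataset_from_directory(label_mode="int")
labels = [2, 0]
preds = [[0.1, 0.2, 0.7],    # true class 2 has probability 0.7
         [0.5, 0.25, 0.25]]  # true class 0 has probability 0.5
losses = sparse_categorical_crossentropy(labels, preds)
print([round(l, 4) for l in losses])  # [0.3567, 0.6931]

So integer labels of shape (batch,) against predictions of shape (batch, 365) are exactly what this loss wants, which suggests the loss itself is not the culprit.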
Upvotes: 1
Views: 413
Reputation: 8092
Well, after a few hours of head scratching I figured it out! All you have to do is change metrics=['Accuracy'] to metrics=['accuracy'] in model.compile. I went back to an old network I built a few years ago that used sparse_categorical_crossentropy and compared it line by line.
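(Why the capitalisation matters, as I understand it: the lowercase string 'accuracy' is special-cased by Keras and resolved against the loss and label shape, here to SparseCategoricalAccuracy, i.e. "does the integer label match the argmax of the softmax output". The capitalised 'Accuracy' instead names the keras.metrics.Accuracy class, which compares y_true and y_pred element-wise and therefore expects matching shapes, producing the (None, 365) vs (None, 1) complaint. A plain-Python toy sketch of the metric that 'accuracy' resolves to:)

labels = [2, 0, 1]            # shape (batch,): integer labels
preds = [[0.1, 0.2, 0.7],     # shape (batch, 3): softmax outputs
         [0.8, 0.1, 0.1],
         [0.3, 0.4, 0.3]]

def sparse_categorical_accuracy(y_true, y_pred):
    # Compare each integer label to the argmax of its prediction vector,
    # which is what 'accuracy' does here; keras.metrics.Accuracy would
    # instead compare the (batch,) labels directly against the
    # (batch, 365) predictions and fail the shape check.
    hits = [label == max(range(len(p)), key=p.__getitem__)
            for label, p in zip(y_true, y_pred)]
    return sum(hits) / len(hits)

print(sparse_categorical_accuracy(labels, preds))  # 1.0

To sidestep the string lookup entirely, you can also pass the metric class explicitly, e.g. metrics=[keras.metrics.SparseCategoricalAccuracy()].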
Upvotes: 1