Reputation: 87
I am trying to build my own network using ResNet152 in Colab. Here is part of my code:
import numpy as np
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical

num_labels = 10  # CIFAR-10 has 10 classes

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
# Normalization: per-pixel mean/std computed on the training set only
mean = np.mean(x_train, axis=0)
std = np.std(x_train, axis=0)
x_train = (x_train - mean) / (std + 1e-7)
x_test = (x_test - mean) / (std + 1e-7)
y_train = to_categorical(y_train, num_labels)
y_test = to_categorical(y_test, num_labels)
Then data augmentation:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=9,
    width_shift_range=.2, height_shift_range=.2,
    horizontal_flip=True, vertical_flip=True,
    rescale=True,
    validation_split=.2
)
datagen.fit(x_train)
datagen_train = datagen.flow(x_train, y_train, shuffle=True, subset='training')
datagen_val = datagen.flow(x_train, y_train, shuffle=True, subset='validation')
And then model building:
from tensorflow.keras.applications import ResNet152
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(ResNet152(include_top=False, pooling='avg'))  # ImageNet weights by default
model.add(Dense(num_labels, activation='softmax'))
for layer in model.layers[:]:
    layer.trainable = True
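Finally I compile and fit with the generators, along these lines (the exact optimizer and epoch count may differ):
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(datagen_train,
          validation_data=datagen_val,
          epochs=20)  # illustrative settings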
But I am getting very poor results in Colab after fitting with the generator. Are there any problems in my code?
Upvotes: 0
Views: 1343
Reputation: 14993
The only problem in your code is that you set all of your layers to trainable, so in practice you do not benefit from transfer learning at all. Depending on two factors, namely the size of your dataset and how similar it is to the dataset the network was pre-trained on (ImageNet), you must freeze a different number of layers in order to get good results.
[Image: the usual transfer-learning decision chart, mapping dataset size and dataset similarity to how many layers to fine-tune.]
The image above summarizes what I have stated before. I would start by training only the last layer (the Dense head) for 2-3 epochs, and then also train, say, the last quarter of the base model's layers (from the 3/4 index up to the last).
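A minimal sketch of that two-stage schedule, assuming the Sequential model from your question (so model.layers[0] is the ResNet152 base) and the datagen_train / datagen_val generators you already built; the optimizer, learning rate, and epoch counts here are only illustrative:
from tensorflow.keras.optimizers import Adam

# Stage 1: freeze the entire ResNet152 base and train only the Dense head.
base = model.layers[0]   # the ResNet152 feature extractor
base.trainable = False   # freezes all of its sublayers
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(datagen_train, validation_data=datagen_val, epochs=3)

# Stage 2: unfreeze roughly the last quarter of the base's layers
# (from the 3/4 index up to the end) and fine-tune with a small learning rate.
base.trainable = True
cutoff = (3 * len(base.layers)) // 4
for layer in base.layers[:cutoff]:
    layer.trainable = False
model.compile(optimizer=Adam(learning_rate=1e-5),  # recompile after changing trainable flags
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(datagen_train, validation_data=datagen_val, epochs=10)
Note that you must recompile the model after changing any trainable flags, otherwise the change has no effect on training.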
Upvotes: 3