Reputation: 792
I am trying to rewrite a Sequential model of Network In Network CNN using Functional API. I use it with CIFAR-10 dataset. The Sequential model trains without a problem, but Functional API model gets stuck. I probably missed something when rewriting the model.
Here's a reproducible example:
Dependencies:
from keras.models import Model, Input, Sequential
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dropout, Activation
from keras.utils import to_categorical
from keras.losses import categorical_crossentropy
from keras.optimizers import Adam
from keras.datasets import cifar10
Loading the dataset:
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train = x_train / 255.
x_test = x_test / 255.
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)
input_shape = x_train[0,:,:,:].shape
Here's the working Sequential model:
model = Sequential()
#mlpconv block1
model.add(Conv2D(32, (5, 5), activation='relu',padding='valid',input_shape=input_shape))
model.add(Conv2D(32, (1, 1), activation='relu'))
model.add(Conv2D(32, (1, 1), activation='relu'))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(0.5))
#mlpconv block2
model.add(Conv2D(64, (3, 3), activation='relu',padding='valid'))
model.add(Conv2D(64, (1, 1), activation='relu'))
model.add(Conv2D(64, (1, 1), activation='relu'))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(0.5))
#mlpconv block3
model.add(Conv2D(128, (3, 3), activation='relu',padding='valid'))
model.add(Conv2D(32, (1, 1), activation='relu'))
model.add(Conv2D(10, (1, 1), activation='relu'))
model.add(GlobalAveragePooling2D())
model.add(Activation('softmax'))
Compile and train:
model.compile(loss=categorical_crossentropy, optimizer=Adam(), metrics=['acc'])
_ = model.fit(x=x_train, y=y_train, batch_size=32,
epochs=200, verbose=1,validation_split=0.2)
In three epochs the model gets close to 50% validation accuracy.
Here's the same model rewritten using Functional API:
model_input = Input(shape=input_shape)
#mlpconv block1
x = Conv2D(32, (5, 5), activation='relu',padding='valid')(model_input)
x = Conv2D(32, (1, 1), activation='relu')(x)
x = Conv2D(32, (1, 1), activation='relu')(x)
x = MaxPooling2D((2,2))(x)
x = Dropout(0.5)(x)
#mlpconv block2
x = Conv2D(64, (3, 3), activation='relu',padding='valid')(x)
x = Conv2D(64, (1, 1), activation='relu')(x)
x = Conv2D(64, (1, 1), activation='relu')(x)
x = MaxPooling2D((2,2))(x)
x = Dropout(0.5)(x)
#mlpconv block3
x = Conv2D(128, (3, 3), activation='relu',padding='valid')(x)
x = Conv2D(32, (1, 1), activation='relu')(x)
x = Conv2D(10, (1, 1), activation='relu')(x)
x = GlobalAveragePooling2D()(x)
x = Activation(activation='softmax')(x)
model = Model(model_input, x, name='nin_cnn')
This model is then compiled using the same parameters as the Sequential model. When trained, the training accuracy gets stuck at 0.10
, meaning the model doesn't get better and randomly chooses one of 10 classes.
What did I miss when rewriting the model? When calling model.summary()
the models look identical except for the explicit Input
layer in the Functional API model.
Upvotes: 0
Views: 1442
Reputation: 792
Removing activation
in the final conv layer solves the problem:
x = Conv2D(10, (1, 1))(x)
Still not sure why the Sequential model works fine with activation in that layer.
Upvotes: 1