Chris K

Reputation: 1723

ResNet50 nan loss with Keras 2

Since upgrading to Keras 2, I'm seeing nan loss when trying to fine-tune ResNet50. Loss and accuracy look OK if I use a single convolutional layer (commented out below) instead of ResNet50. Am I missing something that changed in Keras 2?

from keras.applications.resnet50 import ResNet50
from keras.layers import Flatten, Dense, Input, Conv2D, Activation
from keras.layers.pooling import MaxPooling2D
from keras.models import Model
from keras.optimizers import SGD
import numpy as np

inp = Input(batch_shape=(32, 224, 224, 3), name='input_image')

### resnet
modelres = ResNet50(weights="imagenet", include_top=False, input_tensor=inp)
x = modelres.output
x = Flatten()(x)

### single convolutional layer
#x = Conv2D(32, (3,3))(inp)
#x = Activation('relu')(x)
#x = MaxPooling2D(pool_size=(3,3))(x)
#x = Flatten()(x)
#x = Dense(units=32)(x)
predictions = Dense(units=2, kernel_initializer="he_normal", activation="softmax")(x) 

model = Model(inputs=inp, outputs=predictions)
model.compile(SGD(lr=.001, momentum=0.9), "categorical_crossentropy", metrics=["accuracy"])

# generate images of all ones with the same label
def gen():
    while True:
        x_data = np.ones((32,224,224,3)).astype('float32')
        y_data = np.zeros((32,2)).astype('float32')
        yield x_data, y_data

model.fit_generator(gen(), 10, validation_data=gen(), validation_steps=1)

The beginning and end of model.summary() look like:

Layer (type)                     Output Shape          Param #     Connected to
input_image (InputLayer)         (32, 224, 224, 3)     0
zero_padding2d_1 (ZeroPadding2D) (32, 230, 230, 3)     0
conv1 (Conv2D)                   (32, 112, 112, 64)    9472


avg_pool (AveragePooling2D)      (32, 1, 1, 2048)      0
flatten_1 (Flatten)              (32, 2048)            0
dense_1 (Dense)                  (32, 2)               4098

Training output is:

Epoch 1/1
10/10 [==============================] - 30s - loss: nan - acc: 0.0000e+00 - val_loss: nan - val_acc: 0.0000e+00

Upvotes: 4

Views: 1361

Answers (1)

Chris K

Reputation: 1723

Everything works fine when I switch the backend from Theano to TensorFlow. It looks like something in the Theano implementation broke in Keras 2.
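
For anyone checking which backend they are actually running, here is a minimal sketch of how the choice is usually resolved. It assumes the usual Keras 2 lookup: the KERAS_BACKEND environment variable takes precedence over the "backend" field in ~/.keras/keras.json, and TensorFlow is the default when neither is set. The helper name resolve_backend and the script name train.py are hypothetical. The helper only mirrors that assumed lookup and does not import Keras itself.

```python
import json
import os


def resolve_backend(environ=None, config_path=None):
    """Return the backend name Keras 2 would be expected to pick.

    Hypothetical helper: it mirrors the assumed lookup order
    (env var > keras.json > default) without importing Keras.
    """
    environ = os.environ if environ is None else environ
    if config_path is None:
        config_path = os.path.join(os.path.expanduser("~"), ".keras", "keras.json")

    backend = "tensorflow"  # assumed default when nothing else is configured

    # The config file, if present and readable, may name a backend.
    if os.path.exists(config_path):
        try:
            with open(config_path) as f:
                backend = json.load(f).get("backend", backend)
        except ValueError:
            pass  # malformed JSON: keep the default

    # The environment variable, if set, overrides the config file.
    return environ.get("KERAS_BACKEND", backend)


if __name__ == "__main__":
    # To force TensorFlow for a single run, set the variable before Keras
    # is imported, e.g.:  KERAS_BACKEND=tensorflow python train.py
    print(resolve_backend())
```

If this prints theano, either edit the "backend" field in keras.json or set KERAS_BACKEND=tensorflow before launching the training script. Keras also reports the backend it loaded when it is first imported ("Using TensorFlow backend." or "Using Theano backend."), which is the quickest way to confirm the switch took effect.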

Upvotes: 3
