Chris K
Chris K

Reputation: 1723

ResNet50 nan loss with Keras 2

Since upgrading to Keras 2 I'm seeing nan loss when trying to fine tune ResNet50. Loss and accuracy look ok if I use a single convolutional layer (commented out below) instead of resnet. Am I missing something that changed with Keras 2?

from keras.applications.resnet50 import ResNet50
from keras.layers import Flatten, Dense, Input, Conv2D, Activation, Flatten
from keras.layers.pooling import MaxPooling2D
from keras.models import Model
from keras.optimizers import SGD
import numpy as np

inp = Input(batch_shape=(32, 224, 224, 3), name='input_image')

### resnet
modelres = ResNet50(weights="imagenet", include_top=False, input_tensor=inp)
x = modelres.output
x = Flatten()(x)

### single convolutional layer
#x = Conv2D(32, (3,3))(inp)
#x = Activation('relu')(x)
#x = MaxPooling2D(pool_size=(3,3))(x)
#x = Flatten()(x)
#x = Dense(units=32)(x)
predictions = Dense(units=2, kernel_initializer="he_normal", activation="softmax")(x) 

model = Model(inputs=inp, outputs=predictions)
model.compile(SGD(lr=.001, momentum=0.9), "categorical_crossentropy", metrics=["accuracy"])

# generate images of all ones with the same label
def gen():
    while True:
        x_data = np.ones((32,224,224,3)).astype('float32')
        y_data = np.zeros((32,2)).astype('float32')
        y_data[:,1]=1.0
        yield x_data, y_data

model.fit_generator(gen(), 10, validation_data=gen(), validation_steps=1)

The beginning and end of model.summary() looks like:

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
input_image (InputLayer)         (32, 224, 224, 3)     0
____________________________________________________________________________________________________
zero_padding2d_1 (ZeroPadding2D) (32, 230, 230, 3)     0
____________________________________________________________________________________________________
conv1 (Conv2D)                   (32, 112, 112, 64)    9472

...

avg_pool (AveragePooling2D)      (32, 1, 1, 2048)      0
____________________________________________________________________________________________________
flatten_1 (Flatten)              (32, 2048)            0
____________________________________________________________________________________________________
dense_1 (Dense)                  (32, 2)               4098
====================================================================================================

Training output is:

Epoch 1/1
10/10 [==============================] - 30s - loss: nan - acc: 0.0000e+00 - val_loss: nan - val_acc: 0.0000e+00

Upvotes: 4

Views: 1361

Answers (1)

Chris K
Chris K

Reputation: 1723

Everything works fine when I switch the backend to tensorflow instead of theano. Looks like something about the theano implementation broke in keras 2.

Upvotes: 3

Related Questions