Reputation: 195
I'm trying to build a predictive model for Diabetic Retinopathy Detection. The competition's training dataset consists of high-resolution images that are unevenly divided into 5 classes: Normal - 25807 images (73.48%); Mild - 2442 images (6.96%); Moderate - 5291 images (15.07%); Severe - 873 images (2.48%); Proliferative - 708 images (2.01%). For this purpose I use the Keras framework with the Theano backend (for CUDA computations).
For image augmentation I used ImageDataGenerator (the code is below). I resized the images to 299x299 and divided them into 5 folders according to their classes:
train_datagen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=40,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   fill_mode="constant",
                                   zca_whitening=True)
train_generator = train_datagen.flow_from_directory('data/~huge_data/preprocessed_imgs/',
                                                    target_size=(299, 299),
                                                    batch_size=32,
                                                    class_mode='categorical')
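(Side note: if I read the Keras docs correctly, zca_whitening only takes effect once the generator's statistics have been computed with fit. The sketch below is not my actual code; X_sample is a hypothetical array of training images.)
# Sketch only: ZCA whitening needs the generator fitted on sample data first.
# X_sample is a hypothetical array of images with shape (N, 3, 299, 299).
train_datagen.fit(X_sample)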
At first, just for testing, I decided to use a simple convolutional model:
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential()
model.add(Convolution2D(32, 3, 3, input_shape=(3, 299, 299), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(32, 3, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(64, 3, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(5, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
When fitting with the image generator, I set class_weight to try to compensate for the data imbalance: class_weight = {0: 25807., 1: 2442., 2: 5291., 3: 873., 4: 708.}
model.fit_generator(train_generator,
                    samples_per_epoch=2000,
                    nb_epoch=50,
                    verbose=2,
                    callbacks=callbacks_list,
                    class_weight={0: 25807., 1: 2442., 2: 5291., 3: 873., 4: 708.})
Problems:
Epoch 1/50 110s - loss: 5147.2669 - acc: 0.7366
Epoch 2/50 105s - loss: 5052.3844 - acc: 0.7302
Epoch 3/50 105s - loss: 5042.0261 - acc: 0.7421
Epoch 4/50 105s - loss: 4986.3544 - acc: 0.7361
Epoch 5/50 105s - loss: 4999.4177 - acc: 0.7361
datagen_2=ImageDataGenerator(rescale=1./255)
val_generator=datagen_2.flow_from_directory('data/color_validation_images/',
target_size=(299,299),
batch_size=100,
class_mode='categorical')
y_predict=model.predict_generator(val_generator,
val_samples=82)
[np.argmax(i) for i in y_predict]
The output is:
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Without argmax (partial output):
array([ 9.47651565e-01, 7.30426749e-03, 4.40788604e-02,
6.25302084e-04, 3.39932943e-04], dtype=float32),
array([ 9.51994598e-01, 6.50278665e-03, 4.07058187e-02,
5.17037639e-04, 2.79774162e-04], dtype=float32),
array([ 9.49448049e-01, 6.50656316e-03, 4.32702228e-02,
5.20388770e-04, 2.54814397e-04], dtype=float32),
array([ 9.47873473e-01, 7.13181263e-03, 4.40776311e-02,
6.00705389e-04, 3.16353660e-04], dtype=float32),
array([ 9.53514516e-01, 6.13699574e-03, 3.96034382e-02,
4.82603034e-04, 2.62484333e-04], dtype=float32),
....
I also tried class_weight='auto'. In this case, the model showed 'predictable' output:
Epoch 1/50 107s - loss: 0.9036 - acc: 0.7381
Epoch 2/50 104s - loss: 0.9333 - acc: 0.7321
Epoch 3/50 105s - loss: 0.8865 - acc: 0.7351
Epoch 4/50 106s - loss: 0.8978 - acc: 0.7351
Epoch 5/50 105s - loss: 0.9158 - acc: 0.7302
But it still doesn't work:
severe_DR=plt.imread('data/~huge_data/preprocessed_imgs/3_Severe/99_left.jpeg')
mild_DR=plt.imread('data/~huge_data/preprocessed_imgs/1_Mild/15_left.jpeg')
moderate_DR=plt.imread('data/~huge_data/preprocessed_imgs/2_Moderate/78_right.jpeg')
model.predict(mild_DR.reshape((1,)+x[1].shape))
array([[ 1., 0., 0., 0., 0.]], dtype=float32)
model.predict(severe_DR.reshape((1,)+x[1].shape))
array([[ 1., 0., 0., 0., 0.]], dtype=float32)
model.predict(moderate_DR.reshape((1,)+x[1].shape))
array([[ 1., 0., 0., 0., 0.]], dtype=float32)
What have I done wrong?
After Sergii Gryshkevych's answer, I fixed my model: I changed class_weight to {0: 1, 1: 10.57, 2: 4.88, 3: 29, 4: 35} (for each class, the number of images in the largest class divided by the number of images in that class). Next, I changed the metric to categorical_accuracy, and I increased the number of layers in the model (like here). So, the output after 5 epochs is:
Epoch 1/5
500/500 [==============================] - 52s - loss: 5.6944 - categorical_accuracy: 0.1840
Epoch 2/5
500/500 [==============================] - 52s - loss: 6.7357 - categorical_accuracy: 0.2040
Epoch 3/5
500/500 [==============================] - 52s - loss: 6.7373 - categorical_accuracy: 0.0800
Epoch 4/5
500/500 [==============================] - 52s - loss: 6.0311 - categorical_accuracy: 0.0180
Epoch 5/5
500/500 [==============================] - 51s - loss: 4.9924 - categorical_accuracy: 0.0560
Is this correct?
Is there any way to use quadratic weighted kappa as a metric in Keras?
Upvotes: 3
Views: 1161
Reputation: 4159
"High" accuracy around 73-74% comes from the fact that all images are classified as 0
(the Normal class). Your dataset is imbalanced, since the majority class accounts for 73% of the samples, so accuracy does not say much in this case; you need other metrics derived from the confusion matrix, such as precision, recall, and F1 score.
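For example (a sketch with scikit-learn, not from the original post; it assumes the validation generator was created with shuffle=False so that val_generator.classes lines up with the order of the predictions):
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report

# y_predict is the array of probabilities returned by predict_generator above
y_pred = np.argmax(y_predict, axis=1)
# Ground-truth class indices, in the order the directory iterator yields files
y_true = val_generator.classes[:len(y_pred)]

print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred))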
The multiclass log loss function punishes wrong predictions extremely hard. Your predicted probabilities are almost zero for every class except class 0, so there is nothing surprising about such high loss values.
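As a tiny illustration (my numbers, not the model's): categorical cross-entropy is -log of the probability assigned to the true class, so a confident prediction for the wrong class costs far more than a confident correct one.
import numpy as np

# A class-0 prediction similar to the ones shown above
p = np.array([0.95, 0.007, 0.042, 0.0006, 0.0004])
print(-np.log(p[0]))  # ~0.05 if the true class really is 0
print(-np.log(p[3]))  # ~7.4  if the true class is actually 3 (Severe)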
To sum up, you are facing the classic class imbalance problem. The two most common ways to mitigate it are weighting the classes and over/under-sampling the data. Keras already supports the first one through the class_weight argument of the fit method:
class_weight: dictionary mapping classes to a weight value, used for scaling the loss function (during training only).
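A sketch of how such a dictionary could be built from the class counts in your question (one common inverse-frequency scheme, normalized to the largest class; the exact weights are a design choice, not the only option):
# Class counts from the question: Normal, Mild, Moderate, Severe, Proliferative
counts = {0: 25807, 1: 2442, 2: 5291, 3: 873, 4: 708}
max_count = float(max(counts.values()))

# Rare classes get larger weights, so the loss no longer favours class 0
class_weight = {cls: max_count / n for cls, n in counts.items()}
# -> {0: 1.0, 1: ~10.6, 2: ~4.9, 3: ~29.6, 4: ~36.5}

model.fit_generator(train_generator,
                    samples_per_epoch=2000,
                    nb_epoch=50,
                    class_weight=class_weight)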
The class imbalance problem is nothing new, so there is plenty of reading on this topic, like this and this introductory post.
Upvotes: 3