CNN on small dataset is overfiting

I want to classify pattern on image. My original image shape are 200 000*200 000 i reshape it to 96*96, pattern are still recognizable with human eyes. Pixel value are 0 or 1.

i'm using the following neural network.


train_X, test_X, train_Y, test_Y = train_test_split(cnn_mat, img_bin["Classification"], test_size = 0.2, random_state = 0)

class_weights = class_weight.compute_class_weight('balanced',
                                                 np.unique(train_Y),
                                                 train_Y)


train_Y_one_hot = to_categorical(train_Y)
test_Y_one_hot = to_categorical(test_Y)

train_X,valid_X,train_label,valid_label = train_test_split(train_X, train_Y_one_hot, test_size=0.2, random_state=13)


    model = Sequential()
    model.add(Conv2D(24,kernel_size=3,padding='same',activation='relu',
            input_shape=(96,96,1)))
    model.add(MaxPool2D())
    model.add(Conv2D(48,kernel_size=3,padding='same',activation='relu'))
    model.add(MaxPool2D())
    model.add(Conv2D(64,kernel_size=3,padding='same',activation='relu'))
    model.add(MaxPool2D())
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(256, activation='relu'))
    model.add(Dense(16, activation='softmax'))
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

train = model.fit(train_X, train_label, batch_size=80,epochs=20,verbose=1,validation_data=(valid_X, valid_label),class_weight=class_weights)

I have already run some experiment to find a "good" number of hidden layer and fully connected layer. it's probably not the most optimal architecture since my computer is slow, i just ran different model once and selected best one with matrix confusion, i didn't use cross validation,I didn't try more complex architecture since my number of data is small, i have read small architecture are the best, is it worth to try more complex architecture?

here the result with 5 and 12 epoch, bach size 80. This is the confusion matrix for my test set

As you can see it's look like i'm overfiting. When i only run 5 epoch, most of the class are assigned to class 0; With more epoch, class 0 is less important but classification is still bad

I added 0.8 dropout after each convolutional layer

e.g

    model.add(Conv2D(48,kernel_size=3,padding='same',activation='relu'))
    model.add(MaxPool2D())
    model.add(Dropout(0.8))
    model.add(Conv2D(64,kernel_size=3,padding='same',activation='relu'))
    model.add(MaxPool2D())
    model.add(Dropout(0.8))

With drop out, 95% of my image are classified in class 0.

I tryed image augmentation; i made rotation of all my training image, still used weighted activation function, result didnt improve. Should i try to augment only class with small number of image? Most of the thing i read says to augment all the dataset...

To resume my question are: Should i try more complex model?

Is it usefull to do image augmentation only on unrepresented class? then should i still use weight class (i guess no)?

Should i have hope to find a "good" model with cnn when we see the size of my dataset?

Upvotes: 0

Answers (2)

pouyan

Reputation: 3439

I think according to the imbalanced data, it is better to create a custom data generator for your model so that each of it's generated data batch, contains at least one sample from each class. And also it is better to use Dropout layer after each dense layer instead of conv layer. For data augmentation it is better to at least use combination of rotate, horizontal flip and vertical flip. there are some other approaches for data augmentation like using GAN network or random pixel replacement. For Gan you can check This SO post

For using Gan as data augmenter you can read This Article. For combination of pixel level augmentation and GAN pixel level data augmentation

Upvotes: 2

Auss

Reputation: 491

What I used - in a different setting - was to upsample my data with ADASYN. This algorithm calculates the amount of new data required to balance your classes, and then takes available data to sample novel examples.

There is an implementation for Python. Otherwise, you also have very little data. SVMs are good performing even with little data. You might want to try them or other image classification algorithms depending where the expected pattern is always at the same position, or varies. Then you could also try the Viola–Jones object detection framework.

Upvotes: 1

CNN on small dataset is overfiting

Answers (2)

Related Questions