Samuel Mideksa
Couldn't train a model in Keras

I am trying to train an All-in-One convolutional model for face analysis in Keras using the AFLW dataset, which is about 19.2 GB in size. It printed the model summary successfully, but training does not progress.

My computer has about 4 GB of RAM.

Loading pickle files
Loaded train, test and validation dataset
Loading test images
Loading validation images
dataset/ FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
  self.test_detection = self.test_dataset["is_face"].as_matrix()
Loaded all dataset and images
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            (None, 227, 227, 1)  0                                            
conv2d_1 (Conv2D)               (None, 55, 55, 96)   11712       input_1[0][0]                    
batch_normalization_1 (BatchNor (None, 55, 55, 96)   384         conv2d_1[0][0]                   
max_pooling2d_1 (MaxPooling2D)  (None, 27, 27, 96)   0           batch_normalization_1[0][0]      
conv2d_2 (Conv2D)               (None, 27, 27, 256)  614656      max_pooling2d_1[0][0]            
batch_normalization_2 (BatchNor (None, 27, 27, 256)  1024        conv2d_2[0][0]                   
max_pooling2d_2 (MaxPooling2D)  (None, 13, 13, 256)  0           batch_normalization_2[0][0]      
conv2d_3 (Conv2D)               (None, 13, 13, 384)  885120      max_pooling2d_2[0][0]            
conv2d_4 (Conv2D)               (None, 13, 13, 384)  1327488     conv2d_3[0][0]                   
conv2d_5 (Conv2D)               (None, 13, 13, 512)  1769984     conv2d_4[0][0]                   
conv2d_8 (Conv2D)               (None, 6, 6, 256)    393472      max_pooling2d_1[0][0]            
conv2d_9 (Conv2D)               (None, 6, 6, 256)    393472      conv2d_3[0][0]                   
max_pooling2d_4 (MaxPooling2D)  (None, 6, 6, 512)    0           conv2d_5[0][0]                   
concatenate_1 (Concatenate)     (None, 6, 6, 1024)   0           conv2d_8[0][0]                   
conv2d_10 (Conv2D)              (None, 6, 6, 256)    262400      concatenate_1[0][0]              
flatten_2 (Flatten)             (None, 9216)         0           conv2d_10[0][0]                  
dense_3 (Dense)                 (None, 2048)         18876416    flatten_2[0][0]                  
dropout_3 (Dropout)             (None, 2048)         0           dense_3[0][0]                    
dense_11 (Dense)                (None, 512)          1049088     dropout_3[0][0]                  
dropout_10 (Dropout)            (None, 512)          0           dense_11[0][0]                   
detection_probablity (Dense)    (None, 2)            1026        dropout_10[0][0]                 
Total params: 25,586,242
Trainable params: 25,585,538
Non-trainable params: 704
Epoch 1/10

It prints Epoch 1/10 and then stops. Is this caused by my computer's limited computational resources?

Jeremy Bare
If it gets as far as starting the first epoch, it probably has enough RAM to run properly. You can check your resource monitor to see how much memory is available and whether there is any CPU usage. If there is CPU usage, it is probably just training very slowly.
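If you would rather check from a second terminal than from a graphical resource monitor, something like the following rough sketch could work. It uses only the Python standard library and assumes a Linux-like system: `os.sysconf` and `os.getloadavg` are not available on Windows, and the helper names `memory_snapshot` and `cpu_load` are made up for this example.

```python
import os


def memory_snapshot():
    """Return (total_bytes, available_bytes) of physical RAM, or None if unsupported."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        total = os.sysconf("SC_PHYS_PAGES") * page_size
        available = os.sysconf("SC_AVPHYS_PAGES") * page_size
    except (AttributeError, ValueError, OSError):
        # os.sysconf or one of these names is missing on this platform.
        return None
    if total <= 0 or available < 0:
        return None
    return total, available


def cpu_load():
    """Return the 1-minute load average, or None if unsupported."""
    try:
        return os.getloadavg()[0]
    except (AttributeError, OSError):
        return None


snapshot = memory_snapshot()
if snapshot is None:
    print("Memory info not available on this platform")
else:
    total, available = snapshot
    print(f"RAM: {available / 2**30:.2f} GiB free of {total / 2**30:.2f} GiB")

load = cpu_load()
if load is None:
    print("Load average not available on this platform")
else:
    # A load near or above 1.0 while training suggests at least one core is busy.
    print(f"1-minute load average: {load:.2f} on {os.cpu_count()} logical CPUs")
```

Run it while the training script is sitting at Epoch 1/10. A noticeable load average with some free memory left would point to slow training rather than a hang.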

That is a fairly large model so it could take an extremely long time to train on a small CPU.
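To put a number on "fairly large", here is a back-of-the-envelope estimate of the memory taken by the weights alone. The parameter counts are copied from the summary above. The 4 bytes per parameter is an assumption (float32 weights, the usual Keras default), and the estimate ignores activations, gradients and optimizer state, which add to the total during training.

```python
TOTAL_PARAMS = 25_586_242    # "Total params" from the model summary
DENSE_3_PARAMS = 18_876_416  # dense_3: 9216 inputs * 2048 units + 2048 biases
BYTES_PER_PARAM = 4          # assumption: float32 weights


def weights_mib(n_params, bytes_per_param=BYTES_PER_PARAM):
    """Memory needed to store n_params weights, in MiB."""
    return n_params * bytes_per_param / 2**20


print(f"All weights: {weights_mib(TOTAL_PARAMS):.1f} MiB")  # prints 97.6 MiB
print(f"dense_3 share of parameters: {DENSE_3_PARAMS / TOTAL_PARAMS:.0%}")
```

So the weights themselves are under 100 MiB and fit comfortably in 4 GB of RAM, with roughly three quarters of the parameters in the single dense_3 layer. That is consistent with the model starting to run but being slow on a CPU.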

Make sure your Keras verbosity is set to 1 so that it prints progress for every batch. That is the default, so it should already be set that way unless you changed it.

Also try turning the batch size down to 1 and see whether you get any output, since a smaller batch should complete faster.

If it is running properly but slowly, your best bet is to get a GPU to run it on. If you can't do that, you can try compiling TensorFlow from source so that it uses all the CPU instruction sets your processor supports, optionally with the MKL library, which could speed it up somewhat.

