Reputation: 2133
I generated an LMDB database using the SSD-Caffe fork here. I successfully generated the VOC trainval/test LMDB directories and am able to train the model.
However, during training it takes an inordinately long time to load data from the LMDB database. For example, when profiling with Caffe's time function using this command:
ssdcaffe time --model "jobs/VGGNet/VOC0712/SSD_300x300/train.prototxt" --gpu 0 --iterations 20
The forward pass takes 8.9s on average and the backward pass takes 0.5s on average. A layer-by-layer inspection shows that the data ingestion layer accounts for the bulk of that time, at 8.7s. See below:
I1129 10:14:11.094445 8011 caffe.cpp:404] data forward: 8660.38 ms.
...
I1129 10:14:11.095383 8011 caffe.cpp:412] Average Forward pass: 8933.31 ms.
I1129 10:14:11.095389 8011 caffe.cpp:414] Average Backward pass: 519.549 ms.
If I halve the batch size from 32 to 16, the data ingestion layer time drops by roughly half:
I1129 10:20:07.975527 8093 caffe.cpp:404] data forward: 3906.53 ms.
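As a rough check on the two measurements above, the data-layer time can be divided by the batch size to get a per-image cost. This is only a sketch using the two logged numbers; the helper name `per_image_ms` is made up for illustration. If the per-image cost comes out similar for both batch sizes, that would suggest the time is spent on each image rather than on a fixed per-batch overhead, though two data points are not conclusive.

```python
# Data-layer forward times copied from the log lines above (milliseconds),
# keyed by the batch size used for that run.
timings_ms = {32: 8660.38, 16: 3906.53}


def per_image_ms(batch_size, total_ms):
    """Average data-layer time spent per image (hypothetical helper)."""
    return total_ms / batch_size


per_image = {b: per_image_ms(b, t) for b, t in timings_ms.items()}

for batch_size, cost in sorted(per_image.items()):
    print(f"batch {batch_size}: {cost:.1f} ms per image")
```

Both runs land in the same range of a few hundred milliseconds per image, which is consistent with the cost scaling with the number of images loaded.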
This is clearly not the intended speed, and something is wrong. Any help would be greatly appreciated!
Upvotes: 1
Views: 332
Reputation: 2133
Found my issue:
My images were too big. The standard VOC images that the repo uses are ~350x500 pixels, whereas my images were 1080x1920. When I downsized my images by 3x in each dimension (i.e. 9x fewer pixels), my data ingestion layer took only 181ms (a 48x speedup over the previous time of 8.66s).
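The arithmetic behind those figures can be sketched as follows. The numbers are taken from this thread (1080x1920 images, a 3x downscale, 8660.38 ms before and 181 ms after); the variable names are just illustrative.

```python
# Original image size from the answer (width x height) and the downscale
# factor applied to each dimension.
orig_w, orig_h = 1920, 1080
factor = 3

new_w, new_h = orig_w // factor, orig_h // factor

# Ratio of pixel counts before and after resizing.
pixel_ratio = (orig_w * orig_h) / (new_w * new_h)

# Data-layer forward time before (from the profiling log) and after (from
# the answer), in milliseconds.
before_ms = 8660.38
after_ms = 181.0
speedup = before_ms / after_ms

print(f"resized to {new_w}x{new_h}")
print(f"pixel ratio: {pixel_ratio:.1f}x")
print(f"speedup: {speedup:.1f}x")
```

The measured speedup is noticeably larger than the 9x reduction in pixel count, so the data-layer time did not scale purely linearly with image area in this case.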
Upvotes: 0