Reputation: 91
I am using the Caffe framework for Windows (downloaded from here) on a Windows 7 64-bit machine, with C++ in Visual Studio Community 2013. I use the pre-trained GoogLeNet model to extract the output of the loss1-fc layer as a feature vector for each image. So far so good.
Recently I tried adapting my software to work with video frames, so I changed the first layer from an ImageData layer to a MemoryData layer. That way I can send Caffe a vector of OpenCV Mats instead of the naive approach of writing each frame to disk and passing Caffe a file list.
Now I noticed that I don't get the same results for the same images! With the ImageData layer there was no such problem.
I use the CPU only (no cuDNN, no GPU).
The function I use for feature extraction is the following:
void feature_extraction_pipeline_memory(boost::shared_ptr<Net<Dtype>> feature_extraction_net,
                                        vector<cv::Mat> imgs,
                                        vector<int> labels,
                                        float** blobFeats,
                                        vector<string> blob_names) {
    // Feed all frames into the MemoryData layer at once.
    boost::dynamic_pointer_cast<caffe::MemoryDataLayer<float>>(
        feature_extraction_net->layers()[0])->AddMatVector(imgs, labels);

    size_t num_mini_batches = imgs.size();  // batch_size is 1, so one batch per image
    size_t num_features = blob_names.size();
    int dim_features;
    int batch_size;
    vector<Blob<float>*> input_vec;
    vector<int> image_indices(num_features, 0);

    for (size_t batch_index = 0; batch_index < num_mini_batches; ++batch_index) {
        feature_extraction_net->Forward(input_vec);
        for (size_t i = 0; i < num_features; ++i) {
            const boost::shared_ptr<Blob<Dtype>> feature_blob =
                feature_extraction_net->blob_by_name(blob_names[i]);
            batch_size = feature_blob->num();
            dim_features = feature_blob->count() / batch_size;
            const Dtype* feature_blob_data;
            for (int n = 0; n < batch_size; ++n) {
                // Copy the flattened features of image n into the output buffer.
                feature_blob_data = feature_blob->cpu_data() + feature_blob->offset(n);
                for (int d = 0; d < dim_features; ++d)
                    blobFeats[i][(image_indices[i] * dim_features) + d] = feature_blob_data[d];
                ++image_indices[i];
            } // n < batch_size
        } // i < num_features
    } // batch_index < num_mini_batches
}
The imgs vector is a vector of cv::Mat and labels is a vector of int, all set to 0. To rule out a loading problem, I wrote all the images back to disk after they were added to the vector and checked them; they were fine, so nothing goes wrong when loading the images. By the way, I use OpenCV 3.1.
The MemoryData layer in the GoogLeNet prototxt file is declared as follows:
layer {
  name: "data"
  type: "MemoryData"
  top: "data"
  top: "label"
  memory_data_param {
    batch_size: 1
    channels: 3
    height: 227
    width: 227
  }
  transform_param {
    crop_size: 227
    mirror: true
    mean_file: "model_googlenet_mem/imagenet_mean.binaryproto"
  }
  include: { phase: TEST }
}
and is the first layer.
I print the first 10 values for each image. Note that images 0, 1, 2 and 3 are EXACT copies of the same file, and the same holds for images 6, 7 and 8.
1st run:
0.jpg :: 3.149, 0.000, 0.000, 0.000, 1.586, 0.000, 0.000, 0.755, 0.000, 4.749,
1.jpg :: 2.680, 0.000, 0.000, 0.560, 0.970, 0.000, 0.000, 1.083, 0.000, 4.420,
2.jpg :: 2.680, 0.000, 0.000, 0.560, 0.970, 0.000, 0.000, 1.083, 0.000, 4.420,
3.jpg :: 2.680, 0.000, 0.000, 0.560, 0.970, 0.000, 0.000, 1.083, 0.000, 4.420,
4.jpg :: 3.957, 0.000, 0.000, 0.000, 0.868, 0.000, 0.000, 0.000, 0.000, 6.396,
5.jpg :: 3.179, 0.000, 0.000, 0.000, 0.906, 0.000, 0.000, 0.000, 0.000, 5.508,
6.jpg :: 4.951, 0.000, 0.000, 0.000, 0.000, 0.343, 2.993, 0.000, 0.000, 0.000,
7.jpg :: 4.567, 0.000, 0.000, 0.000, 0.000, 1.251, 2.446, 0.000, 0.000, 0.000,
8.jpg :: 4.951, 0.000, 0.000, 0.000, 0.000, 0.343, 2.993, 0.000, 0.000, 0.000,
9.jpg :: 5.678, 0.000, 0.000, 2.010, 0.000, 1.064, 2.412, 0.000, 0.000, 0.000,
2nd run:
0.jpg :: 2.680, 0.000, 0.000, 0.560, 0.970, 0.000, 0.000, 1.083, 0.000, 4.420,
1.jpg :: 2.680, 0.000, 0.000, 0.560, 0.970, 0.000, 0.000, 1.083, 0.000, 4.420,
2.jpg :: 3.149, 0.000, 0.000, 0.000, 1.586, 0.000, 0.000, 0.755, 0.000, 4.749,
3.jpg :: 2.680, 0.000, 0.000, 0.560, 0.970, 0.000, 0.000, 1.083, 0.000, 4.420,
4.jpg :: 3.957, 0.000, 0.000, 0.000, 0.868, 0.000, 0.000, 0.000, 0.000, 6.396,
5.jpg :: 2.928, 0.000, 0.000, 0.000, 0.769, 0.000, 0.000, 0.000, 0.000, 5.552,
6.jpg :: 4.567, 0.000, 0.000, 0.000, 0.000, 1.251, 2.446, 0.000, 0.000, 0.000,
7.jpg :: 4.567, 0.000, 0.000, 0.000, 0.000, 1.251, 2.446, 0.000, 0.000, 0.000,
8.jpg :: 4.951, 0.000, 0.000, 0.000, 0.000, 0.343, 2.993, 0.000, 0.000, 0.000,
9.jpg :: 5.678, 0.000, 0.000, 2.010, 0.000, 1.064, 2.412, 0.000, 0.000, 0.000,
The layer's output differs for the same images and differs between runs! With the ImageData layer the same procedure shows no such problem. The problem also holds for the output of other layers, for example loss3/classifier, so I suspect there might be a bug in the MemoryData layer implementation.
Has anyone noticed this strange behaviour? I have read that cuDNN may produce non-deterministic results, but I ran my model on the CPU. Any thoughts on this are welcome.
Upvotes: 3
Views: 651
I found out what went wrong and I'll post the answer here to help others.
It turns out that GoogLeNet requires 224x224x3 input images, and you must NOT subtract the mean in the TEST phase. So by changing the definition of the MemoryData layer in the .prototxt file to this:
name: "GoogleNet"
layer {
  name: "data"
  type: "MemoryData"
  top: "data"
  top: "label"
  memory_data_param {
    batch_size: 1
    channels: 3
    height: 224
    width: 224
  }
}
...
I got the results I expected. Many thanks to @Miki for pointing me to the OpenCV tutorial on their dnn module, which helped me clarify this.
Upvotes: 2