Reputation: 1711
I'm trying to use the MNIST Caffe example via the C++ API, but I'm having a bit of trouble working out how to restructure the network prototxt file I'll deploy after training. I've trained and tested the model with the original file (lenet_train_test.prototxt), but when I want to deploy it and make predictions as in the C++ and OpenCV example, I realise I have to modify the input section to make it similar to their deploy.prototxt file.
Can I replace the information in the training and testing layers of the lenet_train_test.prototxt with this section of the deploy.prototxt file?
name: "CaffeNet"
input: "data"
input_shape {
  dim: 10
  dim: 3
  dim: 227
  dim: 227
}
The images I'll be passing to the network for classification will be grayscale and 24x24 pixels, and I'll also want to scale them as was done with the MNIST dataset, so could I modify the section to this?
name: "CaffeNet"
input: "data"
input_shape {
  dim: 10
  dim: 1
  dim: 24
  dim: 24
}
transform_param {
  scale: 0.00390625
}
I'm not entirely sure where the "dim: 10" comes from, though.
Upvotes: 1
Views: 2985
Reputation: 114786
In order to "convert" you train_val prototxt to a deploy one you remove the input data layers (reading your train/val data) and replacing them with the declaration
name: "CaffeNet"
input: "data"
input_shape {
  dim: 10
  dim: 1
  dim: 24
  dim: 24
}
Note that the deploy prototxt does not have two phases for train and test, only a single flavor.
Replacing the input data layer with this declaration basically tells caffe that you are responsible for supplying the data, and that the net should allocate space for inputs of this size.
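For example, supplying the data through the C++ API might look like this (a minimal sketch; lenet_deploy.prototxt and lenet_iter_10000.caffemodel are placeholder names for your own deploy file and trained weights):

#include <caffe/caffe.hpp>
#include <vector>

int main() {
  // Load the deploy net in TEST phase, then the trained weights.
  caffe::Net<float> net("lenet_deploy.prototxt", caffe::TEST);
  net.CopyTrainedLayersFrom("lenet_iter_10000.caffemodel");

  // The net allocated its input blob according to input_shape:
  // here num=10, channels=1, height=24, width=24.
  caffe::Blob<float>* input = net.input_blobs()[0];

  // You are responsible for filling the blob with (already scaled) pixels.
  float* data = input->mutable_cpu_data();
  for (int i = 0; i < input->count(); ++i) {
    data[i] = 0.0f;  // replace with your actual image data
  }

  // Forward pass; the output blob holds the class scores,
  // e.g. scores[0..9] for the first image's 10 digit classes.
  const std::vector<caffe::Blob<float>*>& output = net.Forward();
  const float* scores = output[0]->cpu_data();
  return 0;
}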
Regarding scale: once you deploy your net, the net has no control over its inputs - it does not read the data for you the way the input data layers of the train_val net do. Therefore, you'll have to scale the input data yourself before feeding it to the network. You can use the DataTransformer class to help you transform your input blobs in the same way they were transformed during training.
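For instance, with OpenCV (a sketch; digit.png stands in for your own image file):

#include <opencv2/opencv.hpp>
#include <string>

// Mimic the MNIST "scale: 0.00390625" transform (exactly 1/256):
// load as grayscale and convert to float pixels in [0, 1).
cv::Mat LoadAndScale(const std::string& path) {
  cv::Mat img = cv::imread(path, cv::IMREAD_GRAYSCALE);
  cv::Mat scaled;
  img.convertTo(scaled, CV_32FC1, 0.00390625);
  return scaled;  // copy this into the input blob's mutable_cpu_data()
}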
Regarding the first dim: 10: every Blob (i.e., data/parameter storage unit) in caffe has 4 dimensions: batch size, channels, height and width. This parameter means the net should allocate space for batches of 10 inputs at a time.
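If you only classify one image at a time, you can just as well put dim: 1 there, or reshape the input blob at run time, e.g. (a sketch, reusing the net loaded above):

#include <caffe/caffe.hpp>

// Shrink the input batch from 10 to a single 24x24 grayscale image
// and propagate the new shape through all layers.
void UseSingleImageBatch(caffe::Net<float>& net) {
  net.input_blobs()[0]->Reshape(1, 1, 24, 24);
  net.Reshape();
}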
The "magic" number 10 comes from the way googlenet and other competitors in ILSVRC challenge used to classify images: they classified 10 crops from each image and averaged the outputs to produce better classification results.
Upvotes: 4