Reputation: 2287
I am trying out Google's deepdream code, which makes use of Caffe. It uses the GoogLeNet model pre-trained on ImageNet, as provided by the Model Zoo. That means the network was trained on images cropped to a size of 224x224 pixels. From the train_val.prototxt:
layer {
  name: "data"
  type: "Data"
  ...
  transform_param {
    mirror: true
    crop_size: 224
    ...
The deploy.prototxt used for processing also defines an input layer of shape 10x3x224x224 (ten RGB images of size 224x224, i.e. a batch size of 10):
name: "GoogleNet"
input: "data"
input_shape {
dim: 10
dim: 3
dim: 224
dim: 224
}
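This shape can be verified from pycaffe (a minimal sketch; it assumes deploy.prototxt and bvlc_googlenet.caffemodel sit in the working directory):

import caffe

# Load GoogLeNet from the deploy definition (file names assumed).
net = caffe.Net('deploy.prototxt', 'bvlc_googlenet.caffemodel', caffe.TEST)

# The 'data' blob mirrors the input_shape block above.
print(net.blobs['data'].data.shape)  # (10, 3, 224, 224)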
However, I can use this net to process images of any size (the example above used one of 1024x574 pixels).
How is it possible to run the net on images that are too big for its input layer?
The complete code can be found here.
Upvotes: 4
Views: 7192
Reputation: 114796
DeepDream does not crop the input image at all.
If you pay close attention, you'll notice that it operates on a mid-level layer: its end= argument is set to 'inception_4c/output' or end='inception_3b/5x5_reduce', but NEVER end='loss3/classifier'. The reason is that GoogLeNet up to these layers is a fully convolutional net, that is, it can take an input image of any size and produce outputs whose sizes are proportional to the input size (the exact output size is also affected by convolution padding and pooling).
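You can see this directly (a sketch under the same file-name assumptions as above): reshape the input blob to two different sizes and compare the spatial size of a mid-level output.

import caffe

net = caffe.Net('deploy.prototxt', 'bvlc_googlenet.caffemodel', caffe.TEST)

for size in (224, 448):
    # Resize the input blob and propagate the new shape through the net.
    net.blobs['data'].reshape(1, 3, size, size)
    net.reshape()
    net.forward(end='inception_4c/output')
    print(size, net.blobs['inception_4c/output'].data.shape)

# The spatial dims scale with the input: roughly (1, 512, 14, 14)
# for a 224x224 input and (1, 512, 28, 28) for a 448x448 input.

Doubling the input doubles the feature map, which is exactly the "proportional output" behavior. A fully connected layer such as loss3/classifier, by contrast, has fixed weight dimensions and therefore only accepts the 224x224 input geometry it was trained with.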
To adapt the net to different input sizes, the deepdream function has the line

src.reshape(1,3,h,w) # resize the network's input image size

This reshapes the net's input blob to (1,3,h,w); all subsequent layers adapt to the new shape on the next forward pass.
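In context, the pattern looks roughly like this (a condensed sketch, not the notebook verbatim; the random array stands in for the output of the notebook's preprocess() step, and the file names are assumptions):

import numpy as np
import caffe

net = caffe.Net('deploy.prototxt', 'bvlc_googlenet.caffemodel', caffe.TEST)

# Stand-in for preprocess(net, img): a (3, h, w) float image of
# arbitrary size, here 574x1024 as in the question.
preprocessed = np.random.randn(3, 574, 1024).astype(np.float32)

src = net.blobs['data']                  # the network's input blob
h, w = preprocessed.shape[1:]            # image is (3, h, w)
src.reshape(1, 3, h, w)                  # resize the input blob
src.data[0] = preprocessed               # copy the image into the blob
net.forward(end='inception_4c/output')   # later layers reshape on forward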
Upvotes: 4