Reputation: 2287
I am trying out Google's deepdream code, which makes use of Caffe. It uses the GoogLeNet model pre-trained on ImageNet, as provided by the Model Zoo. That means the network was trained on images cropped to a size of 224x224 pixels. From the train_val.prototxt:
layer {
  name: "data"
  type: "Data"
  ...
  transform_param {
    mirror: true
    crop_size: 224
    ...
The deploy.prototxt used for processing also defines an input layer of shape 10x3x224x224 (ten RGB images of size 224x224, i.e. a batch size of 10):
name: "GoogleNet"
input: "data"
input_shape {
dim: 10
dim: 3
dim: 224
dim: 224
}
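This shape can be verified from pycaffe (a minimal sketch; it assumes deploy.prototxt and bvlc_googlenet.caffemodel sit in the working directory):

import caffe

# Load GoogLeNet from the deploy definition (file names assumed).
net = caffe.Net('deploy.prototxt', 'bvlc_googlenet.caffemodel', caffe.TEST)

# The 'data' blob mirrors the input_shape block above.
print(net.blobs['data'].data.shape)  # (10, 3, 224, 224)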
However, I can use this net to process images of any size (the example above used one of 1024x574 pixels).
How is it possible to run the net on images that are too big for its input layer?
The complete code can be found here.
Upvotes: 4
Views: 7192
Reputation: 114796
DeepDream does not crop the input image at all.
If you pay close attention, you'll notice that it operates on a mid-level layer: its end= argument is set to 'inception_4c/output' or end='inception_3b/5x5_reduce', but NEVER end='loss3/classifier'. The reason is that GoogLeNet up to these layers is a fully convolutional net, that is, it can take an input image of any size and produce outputs whose sizes are proportional to the input size (the exact output size is also affected by convolution padding and pooling).
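You can see this directly (a sketch under the same file-name assumptions as above): reshape the input blob to two different sizes and compare the spatial size of a mid-level output.

import caffe

net = caffe.Net('deploy.prototxt', 'bvlc_googlenet.caffemodel', caffe.TEST)

for size in (224, 448):
    # Resize the input blob and propagate the new shape through the net.
    net.blobs['data'].reshape(1, 3, size, size)
    net.reshape()
    net.forward(end='inception_4c/output')
    print(size, net.blobs['inception_4c/output'].data.shape)

# The spatial dims scale with the input: roughly (1, 512, 14, 14)
# for a 224x224 input and (1, 512, 28, 28) for a 448x448 input.

Doubling the input doubles the feature map, which is exactly the "proportional output" behavior. A fully connected layer such as loss3/classifier, by contrast, has fixed weight dimensions and therefore only accepts the 224x224 input geometry it was trained with.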
To adapt the net to different input sizes, the deepdream function has the line

src.reshape(1,3,h,w) # resize the network's input image size

This reshapes the net's input blob to (1,3,h,w); all subsequent layers adapt to the new shape on the next forward pass.
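In context, the pattern looks roughly like this (a condensed sketch, not the notebook verbatim; the random array stands in for the output of the notebook's preprocess() step, and the file names are assumptions):

import numpy as np
import caffe

net = caffe.Net('deploy.prototxt', 'bvlc_googlenet.caffemodel', caffe.TEST)

# Stand-in for preprocess(net, img): a (3, h, w) float image of
# arbitrary size, here 574x1024 as in the question.
preprocessed = np.random.randn(3, 574, 1024).astype(np.float32)

src = net.blobs['data']                  # the network's input blob
h, w = preprocessed.shape[1:]            # image is (3, h, w)
src.reshape(1, 3, h, w)                  # resize the input blob
src.data[0] = preprocessed               # copy the image into the blob
net.forward(end='inception_4c/output')   # later layers reshape on forward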
Upvotes: 4