Thijser

Reputation: 2633

torch7 size mismatch when feeding an image

I'm trying to run an image through a pretrained neural network in torch7. However, when I run the code I get the error /home/thijser/torch/install/share/lua/5.1/nn/Linear.lua:57: size mismatch at /tmp/luarocks_cutorch-scm-1-6477/cutorch/lib/THC/generic/THCTensorMathBlas.cu:52

Here is the code (or at least the minimal example where the problem occurs):

require 'torch'
require 'nn'
require 'image'
require 'optim'
require 'cutorch'
require 'cunn'

require 'loadcaffe'
local cmd = torch.CmdLine()
local function main(params)
  cutorch.setDevice(1)
  local loadcaffe_backend = 'nn'
  local cnn = loadcaffe.load('models/VGG_ILSVRC_19_layers-deploy.prototxt', 'models/VGG_ILSVRC_19_layers.caffemodel', loadcaffe_backend):float()
  cnn:cuda()
  targetImage_caffe = image.load('tank.jpg', 3)
  targetImage_caffe = targetImage_caffe:cuda()
  netimage = cnn:forward(targetImage_caffe)

end

local params = cmd:parse(arg)
main(params)

And the full error log:

/home/thijser/torch/install/bin/luajit: /home/thijser/torch/install/share/lua/5.1/nn/Container.lua:67: 
In 39 module of nn.Sequential:
/home/thijser/torch/install/share/lua/5.1/nn/Linear.lua:57: size mismatch at /tmp/luarocks_cutorch-scm-1-6477/cutorch/lib/THC/generic/THCTensorMathBlas.cu:52
stack traceback:
    [C]: in function 'addmv'
    /home/thijser/torch/install/share/lua/5.1/nn/Linear.lua:57: in function </home/thijser/torch/install/share/lua/5.1/nn/Linear.lua:53>
    [C]: in function 'xpcall'
    /home/thijser/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
    /home/thijser/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    temp.lua:24: in function 'main'
    temp.lua:37: in main chunk
    [C]: in function 'dofile'
    ...jser/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x5599d0cfa470

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
    [C]: in function 'error'
    /home/thijser/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
    /home/thijser/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    temp.lua:24: in function 'main'
    temp.lua:37: in main chunk
    [C]: in function 'dofile'
    ...jser/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x5599d0cfa470

The models can be downloaded by

cd models
wget -c https://gist.githubusercontent.com/ksimonyan/3785162f95cd2d5fee77/raw/bb2b4fe0a9bb0669211cf3d0bc949dfdda173e9e/VGG_ILSVRC_19_layers_deploy.prototxt
wget -c --no-check-certificate https://bethgelab.org/media/uploads/deeptextures/vgg_normalised.caffemodel
wget -c http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_19_layers.caffemodel
cd ..

print(cnn) gives an output of

nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> (25) -> (26) -> (27) -> (28) -> (29) -> (30) -> (31) -> (32) -> (33) -> (34) -> (35) -> (36) -> (37) -> (38) -> (39) -> (40) -> (41) -> (42) -> (43) -> (44) -> (45) -> (46) -> output]
  (1): nn.SpatialConvolution(3 -> 64, 3x3, 1,1, 1,1)
  (2): nn.ReLU
  (3): nn.SpatialConvolution(64 -> 64, 3x3, 1,1, 1,1)
  (4): nn.ReLU
  (5): nn.SpatialMaxPooling(2x2, 2,2)
  (6): nn.SpatialConvolution(64 -> 128, 3x3, 1,1, 1,1)
  (7): nn.ReLU
  (8): nn.SpatialConvolution(128 -> 128, 3x3, 1,1, 1,1)
  (9): nn.ReLU
  (10): nn.SpatialMaxPooling(2x2, 2,2)
  (11): nn.SpatialConvolution(128 -> 256, 3x3, 1,1, 1,1)
  (12): nn.ReLU
  (13): nn.SpatialConvolution(256 -> 256, 3x3, 1,1, 1,1)
  (14): nn.ReLU
  (15): nn.SpatialConvolution(256 -> 256, 3x3, 1,1, 1,1)
  (16): nn.ReLU
  (17): nn.SpatialConvolution(256 -> 256, 3x3, 1,1, 1,1)
  (18): nn.ReLU
  (19): nn.SpatialMaxPooling(2x2, 2,2)
  (20): nn.SpatialConvolution(256 -> 512, 3x3, 1,1, 1,1)
  (21): nn.ReLU
  (22): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (23): nn.ReLU
  (24): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (25): nn.ReLU
  (26): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (27): nn.ReLU
  (28): nn.SpatialMaxPooling(2x2, 2,2)
  (29): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (30): nn.ReLU
  (31): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (32): nn.ReLU
  (33): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (34): nn.ReLU
  (35): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (36): nn.ReLU
  (37): nn.SpatialMaxPooling(2x2, 2,2)
  (38): nn.View(-1)
  (39): nn.Linear(25088 -> 4096)
  (40): nn.ReLU
  (41): nn.Dropout(0.500000)
  (42): nn.Linear(4096 -> 4096)
  (43): nn.ReLU
  (44): nn.Dropout(0.500000)
  (45): nn.Linear(4096 -> 1000)
  (46): nn.SoftMax
}

While print(targetImage_caffe:size()) gives me

    3
  660
 1045
[torch.LongStorage of size 3]

Anybody know how to fix this or what I'm doing wrong?

Upvotes: 0

Views: 260

Answers (1)

fonfonx

Reputation: 1465

The problem comes from the fact that you are using VGG19, which is designed to be fed 224 x 224 images. Since you are using a 660 x 1045 image (which is unusual, since most convnets take square images), an error occurs at module 39 (you can see it in the stack trace): you are trying to apply a linear module with 25088 input dimensions to a tensor that now has around 327680 values (each pooling layer halves the height and the width, so it divides the number of pixels by roughly 4, and you have 512 feature maps).
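To check these numbers yourself, here is a minimal sketch in plain Lua (no torch needed). The helper names `pooledSize` and `flattenedSize` are made up for this illustration, and it assumes every pooling layer is 2x2 with stride 2, as in the printed network. Whether the pooling rounds down or up depends on how the layers were configured when the model was loaded, so the rounding function is passed in as a parameter.

```lua
-- Spatial size after nPools pooling layers (2x2 window, stride 2).
-- `round` is math.floor or math.ceil, depending on the pooling mode.
local function pooledSize(size, nPools, round)
  for _ = 1, nPools do
    size = round(size / 2)
  end
  return size
end

-- Number of values that nn.View(-1) hands to the first nn.Linear.
local function flattenedSize(height, width, channels, nPools, round)
  return channels * pooledSize(height, nPools, round)
                  * pooledSize(width, nPools, round)
end

print(flattenedSize(660, 1045, 512, 5, math.floor)) -- 512 * 20 * 32
print(flattenedSize(660, 1045, 512, 5, math.ceil))  -- 512 * 21 * 33
print(flattenedSize(224, 224, 512, 5, math.floor))  -- 512 * 7 * 7
```

With either rounding mode the 660 x 1045 image produces far more than 25088 values, while a 224 x 224 input gives exactly 25088.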

The solution is therefore to use 224 x 224 images. After the 5 pooling layers you will then have a tensor of dimension (224 / 2^5) x (224 / 2^5) x 512 = 7 x 7 x 512 = 25088, which matches the input size of the linear module.
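One possible way to get a 224 x 224 input from your image is to scale it so that the shorter side is 224 and then take a centered crop. The sketch below is only an illustration: `fitTo` is a hypothetical helper, not part of torch, and the torch part assumes the `image` package's `image.scale(src, width, height)` and `image.crop(src, x1, y1, x2, y2)` and that `tank.jpg` is in the working directory. Calling `image.scale(img, 224, 224)` directly would also work if you do not mind distorting the aspect ratio.

```lua
-- Hypothetical helper (not part of torch): the size to scale to so that the
-- shorter side equals `target`, plus the top-left corner of a centered crop.
local function fitTo(width, height, target)
  local scale = target / math.min(width, height)
  local w = math.max(target, math.floor(width * scale + 0.5))
  local h = math.max(target, math.floor(height * scale + 0.5))
  return w, h, math.floor((w - target) / 2), math.floor((h - target) / 2)
end

-- Sizes for the 660 x 1045 (height x width) image from the question.
local w, h, x1, y1 = fitTo(1045, 660, 224)
print(w, h, x1, y1)

-- The rest only runs where torch's image package and tank.jpg are available.
local ok, image = pcall(require, 'image')
local file = io.open('tank.jpg', 'rb')
if file then file:close() end
if ok and file then
  local img = image.load('tank.jpg', 3)              -- channels x height x width
  local iw, ih, cx, cy = fitTo(img:size(3), img:size(2), 224)
  img = image.scale(img, iw, ih)                      -- width first, then height
  img = image.crop(img, cx, cy, cx + 224, cy + 224)   -- 3 x 224 x 224
  print(img:size())
end
```

The cropped tensor can then be moved to the GPU with `:cuda()` and passed to `cnn:forward` as in your code.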

Upvotes: 1
