I'm working on a neural network using this module: https://github.com/qassemoquab/stnbhwd/blob/master/AffineGridGeneratorBHWD.lua
The value returned by nn.Jacobian.testJacobian is large when I run the module as :cuda() with a CudaTensor input, but not when I run it as :double() with the equivalent DoubleTensor input.
The :forward outputs of a :double() run and a :cuda() run are very close.
The :backward outputs of a :double() run and a :cuda() run are wildly different, so I think the problem is somewhere in the updateGradInput method (a rough comparison script follows the code below):
function AGG:updateGradInput(_transformMatrix, _gradGrid)
   local transformMatrix, gradGrid
   -- promote non-batched (2D) input to a batch of size 1
   if _transformMatrix:nDimension() == 2 then
      transformMatrix = addOuterDim(_transformMatrix)
      gradGrid = addOuterDim(_gradGrid)
   else
      transformMatrix = _transformMatrix
      gradGrid = _gradGrid
   end

   local batchsize = transformMatrix:size(1)
   -- flatten the spatial dimensions so the gradient can be computed
   -- with a single batched matrix multiplication
   local flattenedGradGrid = gradGrid:view(batchsize, self.width*self.height, 2)
   local flattenedBatchGrid = self.batchGrid:view(batchsize, self.width*self.height, 3)
   self.gradInput:resizeAs(transformMatrix):zero()
   -- gradient w.r.t. the transform: (dL/dGrid)^T * baseGrid, batched over dim 1
   self.gradInput:bmm(flattenedGradGrid:transpose(2,3), flattenedBatchGrid)
   -- strip the batch dimension again if the input was non-batched
   if _transformMatrix:nDimension() == 2 then
      self.gradInput = self.gradInput:select(1,1)
   end
   return self.gradInput
end
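Here is roughly how I'm comparing the two backends (my own throwaway script; I'm assuming the package loads via require 'stn', and since the module has no learnable parameters, two fresh instances should behave identically):

require 'nn'
require 'cunn'
require 'stn'  -- assumed package name providing nn.AffineGridGeneratorBHWD

local input = torch.randn(5, 2, 3)

-- :double() run
local dModule = nn.AffineGridGeneratorBHWD(50, 50):double()
local dOut  = dModule:forward(input):clone()
local dGrad = dModule:backward(input, dOut:clone():fill(1)):clone()

-- :cuda() run on the same data
local cModule = nn.AffineGridGeneratorBHWD(50, 50):cuda()
local cInput  = input:cuda()
local cOut  = cModule:forward(cInput)
local cGrad = cModule:backward(cInput, cOut:clone():fill(1))

print('max forward diff :', (dOut  - cOut:double()):abs():max())   -- very close
print('max backward diff:', (dGrad - cGrad:double()):abs():max())  -- wildly different for me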
The bmm method was the only one I wasn't very familiar with, so I tested it out separately, and it gives comparable results with :double() and :cuda().
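For reference, this is the quick sanity check I used (again just a throwaway snippet):

require 'cutorch'

local A = torch.randn(5, 3, 4)
local B = torch.randn(5, 4, 2)

local resDouble = torch.bmm(A, B)
local resCuda   = torch.bmm(A:cuda(), B:cuda()):double()

-- only float-level noise here, so bmm itself looks fine
print('max bmm diff:', (resDouble - resCuda):abs():max())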
Anybody have any experience with similar problems? I'll continue trying to track down the problem in the meantime...
EDIT: To illustrate the extent of the discrepancy:
th> input = torch.randn(5,2,3)
th> cuda_input = torch.CudaTensor(5,2,3):copy(input)
th> module = nn.AffineGridGeneratorBHWD(50,50)
th> nn.Jacobian.testJacobian(module:double(), input)
5.9742433222709e-10
th> nn.Jacobian.testJacobian(module:cuda(), cuda_input)
0.31908118724823
It's possible that I'm mistaken about the problem being in the updateGradInput method... still poking around.
This is expected, and not wrong or a bug: Jacobian tests need sufficiently high numerical precision, and single (float) precision, which is all a CudaTensor gives you, is not enough for that.
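To convince yourself that it's a precision effect rather than anything CUDA-specific, you could run the same test with the module and input converted to :float() on the CPU. A sketch (assuming the stnbhwd package loads via require 'stn'): the float run should already show an error orders of magnitude larger than the :double() run, because the finite-difference perturbation used by nn.Jacobian (1e-6 by default, if I recall correctly) is close to single-precision round-off.

require 'nn'
require 'stn'  -- assumed package name providing nn.AffineGridGeneratorBHWD

local input  = torch.randn(5, 2, 3)
local module = nn.AffineGridGeneratorBHWD(50, 50)

print(nn.Jacobian.testJacobian(module:double(), input))          -- tiny, as in the question
print(nn.Jacobian.testJacobian(module:float(), input:float()))   -- much larger: single-precision finite differences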