plhn

Reputation: 5273

Why use clone() of ReLU in torch's inception net?

I'm reading https://github.com/Element-Research/dpnn/blob/master/Inception.lua

You can see tons of clone()s in this source, like:

mlp:add(self.transfer:clone())

self.transfer is nothing more than nn.ReLU().

Then,

  1. Why does this code add the activation functions via clone()? Is this only a memory concern?

  2. I thought that clone() shares parameters. Is that right? If so, all the activations in this inception module would share parameters, which seems like nonsense. Do I misunderstand Inception-Net?

Upvotes: 0

Views: 124

Answers (1)

fonfonx

Reputation: 1465

  1. If you don't clone the module self.transfer, then all the transfer modules in your net mlp will share the same state variables output and gradInput.

Look, for example, at this toy code:

require 'nn'

-- a single ReLU instance...
module = nn.ReLU()

-- ...added twice to the same network
net = nn.Sequential():add(nn.Linear(2,2)):add(module):add(nn.Linear(2,1)):add(module)

input = torch.Tensor(2,2):random()
net:forward(input)

-- both entries refer to the same module, hence the same output tensor
print(net:get(2).output)
print(net:get(4).output)

Both print statements will show the same tensor: the two entries in the container refer to the same module instance, so modifying one of the module outputs modifies the other. Since we do not want this behavior, we have to clone the module. (However, in your case, cloning a simple nn.ReLU() is not that useful.)
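For comparison, here is a minimal sketch (my own variation on the toy code above, not taken from the Inception source) where each ReLU is an independent clone, so the two outputs live in separate tensors:

require 'nn'

relu = nn.ReLU()

-- each :clone() is an independent copy with its own output/gradInput buffers
net2 = nn.Sequential()
    :add(nn.Linear(2,2))
    :add(relu:clone())
    :add(nn.Linear(2,1))
    :add(relu:clone())

input = torch.Tensor(2,2):random()
net2:forward(input)

-- now these are different tensors (one 2x2, one 2x1)
print(net2:get(2).output)
print(net2:get(4).output)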

  2. The documentation says

    If arguments are provided to the clone(...) function it also calls share(...) with those arguments on the cloned module after creating it, hence making a deep copy of this module with some shared parameters.

Therefore, if you don't provide any arguments, the parameters won't be shared.
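As a rough illustration of the difference (a sketch using nn.Linear, since nn.ReLU() has no parameters of its own), compare a plain clone() with a clone(...) that shares its parameter tensors:

require 'nn'

lin = nn.Linear(2,2)

-- plain clone: a deep copy with its own parameter tensors
copy = lin:clone()

-- clone with arguments: deep copy, then share(...) is called, so the listed
-- tensors point to the same storage as the original module's
shared = lin:clone('weight', 'bias', 'gradWeight', 'gradBias')

lin.weight:fill(0)   -- overwrite the original weights

print(copy.weight)   -- unchanged: independent copy
print(shared.weight) -- all zeros: shares storage with lin.weight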

Upvotes: 1
