Reputation: 2155
Is there a way to share weights across the parallel streams of a Torch model?
For example, I have the following model:
require 'nn'

mlp = nn.Sequential()
c = nn.Parallel(1, 2) -- Parallel associates a module with each slice of dimension 1
                      -- (rows) and concatenates the outputs along dimension 2.
for i = 1, 10 do      -- add 10 Linear+Reshape branches in parallel (input = 3, output = 2x1)
   local t = nn.Sequential()
   t:add(nn.Linear(3, 2))  -- Linear module (input = 3, output = 2)
   t:add(nn.Reshape(2, 1)) -- reshape the 1D tensor of size 2 to a 2D tensor of size 2x1
   c:add(t)
end
mlp:add(c)
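For reference, a quick sketch of the shapes this produces (using the model built above): a 10x3 input is sliced into 10 rows of size 3, each branch maps its row to a 2x1 tensor, and the branch outputs are concatenated along dimension 2.

-- each of the 10 rows goes through one Linear(3,2)+Reshape(2,1) branch;
-- the ten 2x1 outputs are concatenated along dimension 2 into a 2x10 tensor
print(mlp:forward(torch.randn(10, 3)):size())  -- 2x10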
Now I want to share the parameters of the nn.Linear layers above (everything: weights, biases, and their gradients)
across the different values of i
(so, for example, the nn.Linear(3,2) in branch 1 with the one in branch 9).
What options do I have for sharing them?
Or is it recommended to use a different container, or the module approach, instead?
Upvotes: 0
Views: 136
Reputation: 1465
You can create the module that will be repeated:
t = nn.Sequential()
t:add(nn.Linear(3,2))
t:add(nn.Reshape(2,1))
Then you can use the module's clone
function with additional arguments naming the parameters to share (https://github.com/torch/nn/blob/master/doc/module.md#clonemlp). Passing 'gradWeight' and 'gradBias' as well also shares the gradient buffers, which covers everything you asked for:
mlp = nn.Sequential()
c = nn.Parallel(1,2)
for i = 1, 10 do
   c:add(t:clone('weight', 'bias', 'gradWeight', 'gradBias'))
end
mlp:add(c)
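A quick way to convince yourself the sharing works (a minimal sketch, assuming the mlp and c built above): clone with parameter names makes the clones' tensors point at the same storage, so writing through one branch's weight is visible through every other branch.

local lin1 = c:get(1):get(1)  -- nn.Linear of the 1st branch
local lin9 = c:get(9):get(1)  -- nn.Linear of the 9th branch
lin1.weight:fill(0.5)         -- write through branch 1
print(lin9.weight[1][1])      -- 0.5, branch 9 sees the same parameters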
Upvotes: 1