Reputation: 2155
Is there a way to share weights across the parallel streams of a Torch model?
For example, I have the following model:
require 'nn'

mlp = nn.Sequential()
c = nn.Parallel(1, 2) -- Parallel associates a module with each slice of dimension 1
                      -- (rows) and concatenates the outputs along dimension 2.
for i = 1, 10 do      -- add 10 Linear+Reshape branches in parallel (input = 3, output = 2x1)
   local t = nn.Sequential()
   t:add(nn.Linear(3, 2))  -- Linear module (input = 3, output = 2)
   t:add(nn.Reshape(2, 1)) -- reshape the 1D tensor of size 2 to a 2D tensor of size 2x1
   c:add(t)
end
mlp:add(c)
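For reference, a quick sketch of the shapes this produces (using the model built above): a 10x3 input is sliced into 10 rows of size 3, each branch maps its row to a 2x1 tensor, and the branch outputs are concatenated along dimension 2.

-- each of the 10 rows goes through one Linear(3,2)+Reshape(2,1) branch;
-- the ten 2x1 outputs are concatenated along dimension 2 into a 2x10 tensor
print(mlp:forward(torch.randn(10, 3)):size())  -- 2x10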
Now I want to share the parameters of the nn.Linear layers above (everything: weights, biases, and their gradients)
across the different values of i
(so, for example, the nn.Linear(3,2) in branch 1 with the one in branch 9).
What options do I have for sharing them?
Or is it recommended to use a different container, or the module approach, instead?
Upvotes: 0
Views: 136
Reputation: 1465
You can create the module that will be repeated:
t = nn.Sequential()
t:add(nn.Linear(3,2))
t:add(nn.Reshape(2,1))
Then you can use the module's clone
function with additional arguments naming the parameters to share (https://github.com/torch/nn/blob/master/doc/module.md#clonemlp). Passing 'gradWeight' and 'gradBias' as well also shares the gradient buffers, which covers everything you asked for:
mlp = nn.Sequential()
c = nn.Parallel(1,2)
for i = 1, 10 do
   c:add(t:clone('weight', 'bias', 'gradWeight', 'gradBias'))
end
mlp:add(c)
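A quick way to convince yourself the sharing works (a minimal sketch, assuming the mlp and c built above): clone with parameter names makes the clones' tensors point at the same storage, so writing through one branch's weight is visible through every other branch.

local lin1 = c:get(1):get(1)  -- nn.Linear of the 1st branch
local lin9 = c:get(9):get(1)  -- nn.Linear of the 9th branch
lin1.weight:fill(0.5)         -- write through branch 1
print(lin9.weight[1][1])      -- 0.5, branch 9 sees the same parameters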
Upvotes: 1