Reputation: 315
I'm converting an old project written in TensorFlow v1.13 to PyTorch v1.4.0, and I noticed that TensorFlow and PyTorch have differently sized weight tensors for their 2D convolution layers.
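Here is my TensorFlow code:
import tensorflow as tf

# Assumed input: a batch of single-channel images (NHWC); the 28x28 size is only a placeholder.
img_tensor = tf.placeholder(tf.float32, [None, 28, 28, 1])
cnn = tf.layers.conv2d(img_tensor, 16, (3, 3), (1, 1), padding='SAME', name='cnn_1')
cnn = tf.layers.conv2d(cnn, 32, (3, 3), (1, 1), padding='SAME', name='cnn_2')
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    # Map variable names to variables so the kernel can be looked up by name.
    tf_vars = {v.name: v for v in tf.trainable_variables()}
    print(sess.run(tf_vars['cnn_2/kernel:0']).shape)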
Result
(3, 3, 1, 32)
Here is my PyTorch code:
from torch.nn import Conv2d, Module, Sequential

class Net(Module):
    def __init__(self):
        super(Net, self).__init__()
        self.create_cnn()

    def create_cnn(self):
        self.cnn_layers = Sequential(
            Conv2d(1, 16, 3, padding=1),
            Conv2d(16, 32, 3, padding=1),
        )

    def forward(self, x):
        return self.cnn_layers(x)

def weights_init(m):
    # Print the weight shape of the second conv layer (the one with 32 output channels).
    if isinstance(m, Conv2d) and m.bias.shape[0] == 32:
        print(m.weight.data.shape)

model = Net()
model.apply(weights_init)
Result
torch.Size([32, 16, 3, 3])
This came up because my PyTorch model is not working, so I started going through it one layer at a time and comparing outputs between TensorFlow and PyTorch. For that to work, I had to set the weights in both models to the same values. When I got to the second CNN layer, setting the weights failed because the size was wrong. After a little poking around, I found this difference.
It looks like TensorFlow is using the same kernel across all the channels, whereas PyTorch has a unique kernel for each channel. If this is the case, how can I replicate this in PyTorch?
Upvotes: 1
Views: 563
Reputation: 315
After re-reading the PyTorch docs, I noticed that the groups parameter of Conv2d is directly related to this. That'll teach me not to skim over parts of the docs. By setting groups=in_channels, I now get a weight of size (32, 1, 3, 3), as desired.
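The effect of groups on the weight shape can be sketched without PyTorch itself. The helper below, conv2d_weight_shape, is a hypothetical name and not a PyTorch function; it only reproduces the shape documented for Conv2d.weight, which is (out_channels, in_channels / groups, kernel height, kernel width).

```python
# Hypothetical helper (not part of PyTorch): mirrors the documented shape of
# torch.nn.Conv2d.weight, (out_channels, in_channels // groups, kernel_h, kernel_w).
def conv2d_weight_shape(in_channels, out_channels, kernel_size, groups=1):
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    # Accept either a single int or a (height, width) pair, as Conv2d does.
    if isinstance(kernel_size, int):
        kernel_h, kernel_w = kernel_size, kernel_size
    else:
        kernel_h, kernel_w = kernel_size
    return (out_channels, in_channels // groups, kernel_h, kernel_w)

default_shape = conv2d_weight_shape(16, 32, 3)             # groups=1
grouped_shape = conv2d_weight_shape(16, 32, 3, groups=16)  # groups=in_channels
print(default_shape)  # (32, 16, 3, 3)
print(grouped_shape)  # (32, 1, 3, 3)
```

With groups=in_channels each output channel only sees one input channel, which is a different operation from the default convolution, not just a different weight layout.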
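Edit: Even more embarrassing: in my test code I was feeding my inputs into both CNN layers instead of daisy-chaining them. When I actually run the code as written above, the second CNN layer in TensorFlow does in fact have weights of size (3, 3, 16, 32). That is the same kernel size as PyTorch's (32, 16, 3, 3), just stored in (height, width, in_channels, out_channels) order instead of (out_channels, in_channels, height, width).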
But at least I learned about grouping.
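For anyone else copying weights between the two frameworks for a layer-by-layer comparison, here is a minimal sketch of the axis reordering, assuming NumPy is available. The array tf_kernel is made-up stand-in data for whatever sess.run returns for 'cnn_2/kernel:0'; the variable names are mine.

```python
import numpy as np

# Stand-in for the kernel fetched from TensorFlow; the layout is
# (height, width, in_channels, out_channels).
tf_kernel = np.arange(3 * 3 * 16 * 32, dtype=np.float32).reshape(3, 3, 16, 32)

# Reorder the axes to PyTorch's (out_channels, in_channels, height, width).
torch_kernel = np.transpose(tf_kernel, (3, 2, 0, 1))

print(torch_kernel.shape)  # (32, 16, 3, 3)

# The same weight is now addressed as [out, in, row, col]
# instead of [row, col, in, out].
assert torch_kernel[5, 7, 1, 2] == tf_kernel[1, 2, 7, 5]
```

The reordered array could then be copied into the PyTorch layer's weight. The inputs need the same care when comparing outputs: tf.layers.conv2d defaults to channels-last (NHWC) data, while Conv2d expects channels-first (NCHW).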
Upvotes: 1