Reputation: 24721
I was watching this video, where Phil points out that using torch.nn.Sequential is faster than not using it. A quick Google search turned up this post, which was not answered satisfactorily, so I am reproducing it here.
Here is the code from the post with Sequential:
import torch.nn as nn

class net2(nn.Module):
    def __init__(self):
        super(net2, self).__init__()
        # First conv block: conv -> ReLU -> 2x2 max pool
        self.conv1 = nn.Sequential(
            nn.Conv2d(
                in_channels=1,
                out_channels=16,
                kernel_size=5,
                stride=1,
                padding=2,
            ),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        )
        # Second conv block: conv -> ReLU -> 2x2 max pool
        self.conv2 = nn.Sequential(
            nn.Conv2d(16, 32, 5, 1, 2),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.out = nn.Linear(32 * 7 * 7, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = x.view(x.size(0), -1)  # flatten to (batch, 32 * 7 * 7)
        output = self.out(x)       # raw logits, no activation
        return output
And here is the net without Sequential:
import torch.nn as nn
import torch.nn.functional as F

class net1(nn.Module):
    def __init__(self):
        super(net1, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, 5, 1, 2)
        self.conv2 = nn.Conv2d(16, 32, 5, 1, 2)
        self.pool1 = nn.MaxPool2d(2)
        self.pool2 = nn.MaxPool2d(2)
        self.fc1 = nn.Linear(32 * 7 * 7, 10)

    def forward(self, x):
        out = F.relu(self.conv1(x))
        out = self.pool1(out)
        out = F.relu(self.conv2(out))
        out = self.pool2(out)
        out = out.view(-1, 32 * 7 * 7)  # flatten to (batch, 32 * 7 * 7)
        out = F.relu(self.fc1(out))     # note: ReLU applied to the final output
        return out
In a comment, the author also states that accuracy is better with Sequential than without it. One of the comments claims that Sequential is faster because loop unrolling results in faster execution. So here are my related questions:
Is loop unrolling the only reason why the Sequential implementation is faster than the other one?
Can someone explain why the non-Sequential version does not result in loop unrolling while the Sequential version does, perhaps by pointing me to the relevant PyTorch source lines on GitHub?
Does Sequential really result in better accuracy, and if so, why?
Upvotes: 1
Views: 1613
Reputation: 13601
Note that these two architectures are different. One has a ReLU right before the output (net1, F.relu(self.fc1(out))) and the other doesn't (net2, self.out(x)). This might explain why one has a different accuracy than the other.
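To illustrate why that final ReLU can matter, here is a minimal pure-Python sketch (no PyTorch needed). The logit values are made up for illustration and are not taken from either network. A ReLU on the last layer clamps every negative score to zero, so classes whose scores were all negative can no longer be told apart:

```python
def relu(values):
    """Element-wise ReLU on a plain list of floats."""
    return [max(0.0, v) for v in values]

# Hypothetical raw outputs of the last linear layer for one sample (3 classes).
logits = [-1.5, -0.2, -3.0]

# net2-style head: logits are used as-is, so class 1 has the highest score.
pred_without_relu = max(range(len(logits)), key=lambda i: logits[i])

# net1-style head: every negative logit becomes 0.0, so all classes tie
# and max() simply returns the first index.
clamped = relu(logits)
pred_with_relu = max(range(len(clamped)), key=lambda i: clamped[i])

print(pred_without_relu, clamped, pred_with_relu)  # 1 [0.0, 0.0, 0.0] 0
```

The gradient of a ReLU is also zero for negative inputs, so clamped outputs pass no gradient back during training. That would be one plausible reason for an accuracy gap that has nothing to do with Sequential itself.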
With regard to speed, I don't think there should be any difference. The cost of a ReLU is usually negligible. Given that you didn't state the speed difference, it is difficult to say what the reason could be.
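If you want to put a number on the speed difference, a small timing harness along these lines could help. This is only a sketch: `median_call_time` is a hypothetical helper name, and the workload below is a stand-in so the snippet runs without PyTorch.

```python
import statistics
import time

def median_call_time(fn, arg, warmup=10, repeats=100):
    """Return the median wall-clock seconds of fn(arg) over `repeats` calls."""
    for _ in range(warmup):          # warm-up calls are not timed
        fn(arg)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# Stand-in workload so the sketch runs on its own.
elapsed = median_call_time(lambda n: sum(range(n)), 10_000)
print(f"{elapsed * 1e6:.1f} microseconds per call")
```

For the two models you would call it as `median_call_time(net1(), x)` and `median_call_time(net2(), x)` with the same input, for example `x = torch.randn(64, 1, 28, 28)`. The 28x28 size is an assumption inferred from the `32 * 7 * 7` linear layer, and the batch size of 64 is arbitrary. On a GPU you would also need to call `torch.cuda.synchronize()` before reading the clock, since CUDA kernels run asynchronously.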
Upvotes: 2
Reputation: 11628
I'm not familiar with what kind of optimizations the Python interpreter does, but I'd guess they are very limited.
But the claim that one method is more accurate than the other is complete nonsense. If you look at the implementation of nn.Sequential, you'll see that it does exactly what you'd do anyway: it just iterates over the modules and passes the output of one as the input to the next:
def forward(self, input):
    for module in self:
        input = module(input)
    return input
There might be a tiny difference in speed due to the overhead of the additional things that nn.Sequential might do, but this is negligible.
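To make the equivalence concrete, here is a pure-Python sketch. `TinySequential` is a hypothetical stand-in, not the real nn.Sequential, and the "layers" are plain functions rather than modules. It runs the same loop as the forward() quoted above and gives the same results as chaining the calls by hand:

```python
class TinySequential:
    """Hypothetical stand-in for nn.Sequential: applies callables in order."""

    def __init__(self, *layers):
        self.layers = list(layers)

    def __call__(self, value):
        for layer in self.layers:    # feed each output into the next layer
            value = layer(value)
        return value

# Toy "layers": plain functions instead of nn.Module instances.
def scale(x):
    return 2.0 * x - 3.0

def relu(x):
    return max(0.0, x)

def halve(x):
    return x / 2.0

chained = TinySequential(scale, relu, halve)

inputs = [-1.0, 0.5, 4.0]
via_loop = [chained(x) for x in inputs]
via_hand = [halve(relu(scale(x))) for x in inputs]  # explicit chaining

print(via_loop == via_hand)  # True
```

The loop only changes how the calls are dispatched, not what is computed, so given the same layers and weights the outputs match.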
Upvotes: 2