Reputation: 13
I have an RTX 3070. Somehow using autocast slows down my code.
torch.version.cuda prints 11.1, torch.backends.cudnn.version() prints 8005 and my PyTorch version is 1.9.0. I’m using Ubuntu 20.04 with Kernel 5.11.0-25-generic.
This is the code I’ve been using (indentation restored; `oss` was a typo for `loss`, and the `GradScaler` the loop relies on is defined up front):

scaler = torch.cuda.amp.GradScaler()

torch.cuda.synchronize()
start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
start.record()
for epoch in range(10):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        optimizer.zero_grad()
        # Forward pass and loss run under autocast (mixed precision)
        with torch.cuda.amp.autocast():
            outputs = net(inputs)
            loss = criterion(outputs, labels)
        # Backward and optimizer step use the gradient scaler
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
end.record()
torch.cuda.synchronize()
print(start.elapsed_time(end))
Without torch.cuda.amp.autocast(), one epoch takes 22 seconds, whereas with autocast() one epoch takes 30 seconds.
Upvotes: 0
Views: 1009
Reputation: 11
I came across this post because I was trying the same code and seeing slower performance. By the way, to actually run on the GPU, you need to move each batch to the device at every step:
inputs, labels = data[0].to(device), data[1].to(device)
Even when I made my network 10 times bigger, I did not see any performance improvement. Something else might be wrong at the setup level. I am going to try PyTorch Lightning.
Upvotes: 0
Reputation: 13
It turns out my model was not big enough to benefit from mixed precision. When I increased the in/out channels of the convolutional layers, it finally worked as expected.
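A minimal sketch of what "big enough" means here (this is not the asker's actual model; the layer widths are illustrative). On Ampere GPUs like the RTX 3070, FP16 convolutions hit Tensor Cores most effectively when channel counts are multiples of 8, and the fixed overhead of autocast's dtype casting and the GradScaler bookkeeping only pays off once the layers are wide enough to dominate runtime:

```python
import torch
import torch.nn as nn

class WideNet(nn.Module):
    """Small CNN with channel counts that are multiples of 8,
    so FP16 convolutions can use Tensor Cores under autocast."""
    def __init__(self, in_channels=3, width=256, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, width, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global average pool to (N, width, 1, 1)
        )
        self.classifier = nn.Linear(width, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

net = WideNet()
out = net(torch.randn(2, 3, 32, 32))
print(out.shape)  # torch.Size([2, 10])
```

With narrow channels (say 6 or 16), each kernel launch is too small for the reduced-precision math to matter, so the casting overhead dominates and autocast can come out slower, which matches the asker's 22 s vs. 30 s observation.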
Upvotes: 1