dubistweltmeister

Reputation: 13

Mixed Precision (PyTorch Autocast) Slows Down the Code

I have an RTX 3070. Somehow, using autocast slows down my code.

torch.version.cuda prints 11.1, torch.backends.cudnn.version() prints 8005 and my PyTorch version is 1.9.0. I’m using Ubuntu 20.04 with Kernel 5.11.0-25-generic.

This is the code I've been using:

import torch

# net, criterion, optimizer, and trainloader are assumed to be defined earlier.
scaler = torch.cuda.amp.GradScaler()

torch.cuda.synchronize()
start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
start.record()

for epoch in range(10):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data

        optimizer.zero_grad()

        # Run the forward pass and loss computation in mixed precision
        with torch.cuda.amp.autocast():
            outputs = net(inputs)
            loss = criterion(outputs, labels)

        # Scale the loss, backpropagate, step the optimizer, update the scaler
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()

end.record()

torch.cuda.synchronize()

print(start.elapsed_time(end))

Without torch.cuda.amp.autocast(), one epoch takes 22 seconds; with autocast(), it takes 30 seconds.

Upvotes: 0

Views: 1009

Answers (2)

amirhp

Reputation: 11

I came across this post because I was trying the same code and seeing slower performance. By the way, to use the GPU you need to move the data onto the device at each step:

inputs, labels = data[0].to(device), data[1].to(device)
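For completeness, here is a minimal sketch of the device setup this line relies on; the device variable is not shown in the question's code, so this part is an assumption about the surrounding script:

import torch

# Assumed setup (hypothetical): select the GPU when available and move the
# model onto it once, before the training loop starts.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = net.to(device)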

Even when I made my network 10 times bigger, I did not see a performance improvement.

Something else might be wrong at the setup level. I am going to try PyTorch Lightning.
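One quick sanity check worth adding (my suggestion, not from the original post) is to confirm that autocast is actually producing half-precision activations:

# Inside the training loop, with inputs already on the GPU:
with torch.cuda.amp.autocast():
    outputs = net(inputs)

# Autocast-eligible ops such as convolutions and linear layers return float16.
# If this prints torch.float32, autocast is not taking effect, for example
# because the tensors are still on the CPU.
print(outputs.dtype)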

Upvotes: 0

dubistweltmeister

Reputation: 13

It turns out my model was not big enough to benefit from mixed precision. When I increased the in/out channels of the convolutional layers, it finally worked as expected.
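For illustration, here is a hypothetical before/after of the kind of change described; the layer sizes are made up and not from the original post:

import torch.nn as nn

# Small layer: too little arithmetic per kernel for float16 Tensor Core math
# to outweigh the casting overhead that autocast adds around each op.
conv_small = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)

# Wider layer: enough work per kernel that mixed precision starts to pay off.
conv_wide = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=5)

As a rule of thumb, Tensor Cores also prefer channel counts that are multiples of 8, so wider, well-aligned layers benefit most from autocast.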

Upvotes: 1
