Reputation: 99
I have a very simple optimization: a straight line. Here is the code:
import torch
import torch.nn as nn
import torch.optim as optim

use_gpu = torch.cuda.is_available()
learning_rate = 0.05
loss_function = nn.MSELoss()

train_inputs = torch.FloatTensor([1, 2, 3, 4, 5, 6]).T.unsqueeze(0)
y_truth = torch.FloatTensor([10, 15, 20, 25, 30, 35]).unsqueeze(0)

W = torch.nn.Parameter(torch.rand(1), requires_grad=True)
b = torch.nn.Parameter(torch.rand(1), requires_grad=True)
optimizer = optim.Adam([b, W], lr=learning_rate)

# if use_gpu:
#     y_truth = y_truth.cuda()
#     W = W.cuda()
#     b = b.cuda()
#     train_inputs = train_inputs.cuda()

for epoch in range(1000):
    optimizer.zero_grad()
    y_preds = b + W * train_inputs
    loss = loss_function(y_truth, y_preds)
    loss.backward()
    optimizer.step()
    if epoch % 100 == 0:
        print(loss.data, W.data, b.data)
That code works fine if I do not put the data on the GPU. If I uncomment the if use_gpu
block, the code runs, but the loss is not minimized and the variables are not updated.
I would expect the code to work similarly on the GPU or not. Any idea what is happening?
Thanks!
Upvotes: 0
Views: 1001
Reputation: 12567
Any idea what is happening?
Yes: the parameters you are training, W and b, stayed on the host (CPU).
When you did
W = W.cuda()
b = b.cuda()
you did not move the parameters. .cuda() returns a new tensor on the GPU, so rebinding the names W and b to those copies just ignores the actual parameters being optimized: the optimizer keeps updating the original CPU Parameters, while your forward pass uses the frozen GPU copies.
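You can see the rebinding directly; this is only an illustrative sketch (not from your post), and it assumes a CUDA device is available:

import torch

W = torch.nn.Parameter(torch.rand(1), requires_grad=True)  # leaf Parameter on the CPU
optimizer = torch.optim.Adam([W], lr=0.05)                  # optimizer stores a reference to this object

W = W.cuda()  # returns a *new*, non-leaf GPU tensor; the name W no longer refers to the Parameter

print(type(W), W.is_leaf, W.device)                   # torch.Tensor, False, cuda:0
print(optimizer.param_groups[0]["params"][0].device)  # cpu -- the optimizer still holds the old object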
If you wish to use the GPU for this, you could try:
W = torch.nn.Parameter(torch.rand(1).cuda())
b = torch.nn.Parameter(torch.rand(1).cuda())
instead.
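Putting it together, here is one possible sketch of the full script with the data and parameters created on the target device before the optimizer is built (this uses the device= / .to(device) idiom rather than .cuda(), and falls back to the CPU when CUDA is not available):

import torch
import torch.nn as nn
import torch.optim as optim

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

learning_rate = 0.05
loss_function = nn.MSELoss()

# Create data and parameters on the target device *before* building the optimizer,
# so the optimizer holds the same objects used in the forward pass.
train_inputs = torch.FloatTensor([1, 2, 3, 4, 5, 6]).unsqueeze(0).to(device)
y_truth = torch.FloatTensor([10, 15, 20, 25, 30, 35]).unsqueeze(0).to(device)

W = torch.nn.Parameter(torch.rand(1, device=device))
b = torch.nn.Parameter(torch.rand(1, device=device))
optimizer = optim.Adam([b, W], lr=learning_rate)

for epoch in range(1000):
    optimizer.zero_grad()
    y_preds = b + W * train_inputs
    loss = loss_function(y_preds, y_truth)
    loss.backward()
    optimizer.step()
    if epoch % 100 == 0:
        print(loss.item(), W.item(), b.item())

The important part is that the Parameter objects passed to optim.Adam are the very same objects used to compute y_preds; no reassignment happens after the optimizer is created.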
Upvotes: 3