Reputation: 99
I have a very simple optimization: a straight line. Here is the code:
import torch
import torch.nn as nn
import torch.optim as optim

use_gpu = torch.cuda.is_available()
learning_rate = 0.05
loss_function = nn.MSELoss()

train_inputs = torch.FloatTensor([1, 2, 3, 4, 5, 6]).T.unsqueeze(0)
y_truth = torch.FloatTensor([10, 15, 20, 25, 30, 35]).unsqueeze(0)

W = torch.nn.Parameter(torch.rand(1), requires_grad=True)
b = torch.nn.Parameter(torch.rand(1), requires_grad=True)
optimizer = optim.Adam([b, W], lr=learning_rate)

# if use_gpu:
#     y_truth = y_truth.cuda()
#     W = W.cuda()
#     b = b.cuda()
#     train_inputs = train_inputs.cuda()

for epoch in range(1000):
    optimizer.zero_grad()
    y_preds = b + W * train_inputs
    loss = loss_function(y_truth, y_preds)
    loss.backward()
    optimizer.step()
    if epoch % 100 == 0:
        print(loss.data, W.data, b.data)
That code works fine if I do not put the data on the GPU. If I uncomment the if use_gpu
block, the code runs, but the loss is not minimized and the variables are not updated.
I would expect the code to work similarly on the GPU or not. Any idea what is happening?
Thanks!
Upvotes: 0
Views: 1001
Reputation: 12567
Any idea what is happening?
Yes: the parameters you are training, W and b, stayed on the host (CPU).
When you did
W = W.cuda()
b = b.cuda()
you did not move the parameters. .cuda() returns a new tensor on the GPU, so rebinding the names W and b to those copies just ignores the actual parameters being optimized: the optimizer keeps updating the original CPU Parameters, while your forward pass uses the frozen GPU copies.
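You can see the rebinding directly; this is only an illustrative sketch (not from your post), and it assumes a CUDA device is available:

import torch

W = torch.nn.Parameter(torch.rand(1), requires_grad=True)  # leaf Parameter on the CPU
optimizer = torch.optim.Adam([W], lr=0.05)                  # optimizer stores a reference to this object

W = W.cuda()  # returns a *new*, non-leaf GPU tensor; the name W no longer refers to the Parameter

print(type(W), W.is_leaf, W.device)                   # torch.Tensor, False, cuda:0
print(optimizer.param_groups[0]["params"][0].device)  # cpu -- the optimizer still holds the old object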
If you wish to use the GPU for this, you could try:
W = torch.nn.Parameter(torch.rand(1).cuda())
b = torch.nn.Parameter(torch.rand(1).cuda())
instead.
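Putting it together, here is one possible sketch of the full script with the data and parameters created on the target device before the optimizer is built (this uses the device= / .to(device) idiom rather than .cuda(), and falls back to the CPU when CUDA is not available):

import torch
import torch.nn as nn
import torch.optim as optim

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

learning_rate = 0.05
loss_function = nn.MSELoss()

# Create data and parameters on the target device *before* building the optimizer,
# so the optimizer holds the same objects used in the forward pass.
train_inputs = torch.FloatTensor([1, 2, 3, 4, 5, 6]).unsqueeze(0).to(device)
y_truth = torch.FloatTensor([10, 15, 20, 25, 30, 35]).unsqueeze(0).to(device)

W = torch.nn.Parameter(torch.rand(1, device=device))
b = torch.nn.Parameter(torch.rand(1, device=device))
optimizer = optim.Adam([b, W], lr=learning_rate)

for epoch in range(1000):
    optimizer.zero_grad()
    y_preds = b + W * train_inputs
    loss = loss_function(y_preds, y_truth)
    loss.backward()
    optimizer.step()
    if epoch % 100 == 0:
        print(loss.item(), W.item(), b.item())

The important part is that the Parameter objects passed to optim.Adam are the very same objects used to compute y_preds; no reassignment happens after the optimizer is created.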
Upvotes: 3