Reputation: 79
/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [18,0,0], thread: [54,0,0] Assertion input_val >= zero && input_val <= one failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [18,0,0], thread: [55,0,0] Assertion input_val >= zero && input_val <= one failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [18,0,0], thread: [56,0,0] Assertion input_val >= zero && input_val <= one failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [18,0,0], thread: [57,0,0] Assertion input_val >= zero && input_val <= one failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [18,0,0], thread: [58,0,0] Assertion input_val >= zero && input_val <= one failed.
/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [18,0,0], thread: [59,0,0] Assertion input_val >= zero && input_val <= one failed.
Traceback (most recent call last):
  File "run_toys.py", line 215, in <module>
    loss = criterion(torch.reshape(out, [-1, dataset.out_dim]), torch.reshape(target, [-1, dataset.out_dim]))
  File "/usr/local/python3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/python3/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 530, in forward
    return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)
  File "/usr/local/python3/lib/python3.6/site-packages/torch/nn/functional.py", line 2526, in binary_cross_entropy
    input, target, weight, reduction_enum)
RuntimeError: CUDA error: device-side assert triggered
The Code
criterion = nn.CrossEntropyLoss()
loss = criterion(torch.reshape(out, [-1, dataset.out_dim]), torch.reshape(target, [-1, dataset.out_dim]))
loss = torch.mean(loss)
The shapes of the target and the output are the same: torch.Size([640, 32]).
The model runs fine on my CPU, but running on the GPU raises this error.
Upvotes: 4
Views: 10619
Reputation: 316
There might be two reasons for the error (a short sketch combining both fixes follows below):

1. input_val is not in the range [0, 1], so you should make sure the model outputs stay in that range. You can use PyTorch's torch.clamp(): before calculating the loss, add out = out.clamp(0, 1).

2. nan values in the output trigger the assert as well. To prevent this you can use the following trick, again before calculating the loss: out[out != out] = 0 (or 1, depending on what your model needs). The trick relies on the fact that only nan satisfies nan != nan, so those entries are found and replaced with a valid number.
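A minimal sketch of the two fixes applied before computing the loss, assuming out and target are already reshaped to [640, 32] as in the question and that the intended criterion is nn.BCELoss (the traceback goes through binary_cross_entropy); the random tensors here are only placeholders for the real model output and labels:

import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
out = torch.randn(640, 32, device=device)                       # placeholder model output
target = torch.randint(0, 2, (640, 32), device=device).float()  # placeholder 0/1 labels

out = out.clamp(0, 1)    # reason 1: force predictions into [0, 1]
out[out != out] = 0      # reason 2: replace nan values (only nan satisfies x != x)

criterion = nn.BCELoss()
loss = criterion(out, target)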
Upvotes: 7