Reputation: 63
I am using PyTorch to simulate neural networks on a quantum computer, so I have to work with tensors of the ComplexFloatTensor datatype. When I run this line of code on the GPU:
torch.matmul(A.transpose(1,2).flatten(0,1), H.flatten(1,2)).reshape(N,steps,2**n,2**n).transpose(0,1)
I get the following error when the tensors are LARGE:
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasCgemm( handle, opa, opb, m, n, k, reinterpret_cast<const cuComplex*>(&alpha), reinterpret_cast<const cuComplex*>(a), lda, reinterpret_cast<const cuComplex*>(b), ldb, reinterpret_cast<const cuComplex*>(&beta), reinterpret_cast<cuComplex*>(c), ldc)`
A and H are both ComplexFloatTensor tensors.
The above error starts occurring when A and H have shapes torch.Size([100, 54, 10]) and torch.Size([54, 512, 512]) or larger, but it does not occur when they have shapes torch.Size([100, 44, 10]) and torch.Size([44, 256, 256]).
Don't worry too much about the exact numbers; the point is that the computation always works on the CPU (just very slowly), but on the GPU it breaks past a certain size.
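For reference, here is a self-contained snippet that reproduces the setup with random data. The values N=100, steps=10, n=9 and the variable names are my assumptions for how the dimensions line up with the shapes above; the real model is of course different:

    import torch

    # Assumed values chosen so the shapes match the failing case above:
    # A: (100, 54, 10), H: (54, 512, 512). These are placeholders, not the real model.
    N, steps, n = 100, 10, 9
    contract = 54  # dimension shared by A and H, contracted by the matmul

    device = "cuda"
    A = torch.randn(N, contract, steps, dtype=torch.complex64, device=device)
    H = torch.randn(contract, 2**n, 2**n, dtype=torch.complex64, device=device)

    # The line in question: flatten A and H into 2-D matrices, contract them over
    # the shared dimension, then restore the (steps, N, 2**n, 2**n) batch layout.
    out = (
        torch.matmul(A.transpose(1, 2).flatten(0, 1), H.flatten(1, 2))
        .reshape(N, steps, 2**n, 2**n)
        .transpose(0, 1)
    )
    print(out.shape, out.dtype)  # torch.Size([10, 100, 512, 512]) torch.complex64

Under these assumptions the output alone is N * steps * 4**n complex64 values, i.e. roughly 2 GiB, on top of whatever else the simulation already keeps on the GPU.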
Does anyone know what the problem could be? Given the edit below, it might just be that the GPU ran out of memory (but the error failed to say so).
EDIT: I ran the same thing on Google Colab and got the following error at the same place:
RuntimeError: CUDA out of memory. Tried to allocate 570.00 MiB (GPU 0; 14.76 GiB total capacity; 12.19 GiB already allocated; 79.75 MiB free; 13.38 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Google Colab uses Tesla T4 GPUs, while my server uses an NVIDIA RTX A6000.
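For completeness, the max_split_size_mb knob that the Colab error message mentions is set through the PYTORCH_CUDA_ALLOC_CONF environment variable; a minimal way to try it (128 MiB is just an arbitrary example value) would be:

    import os

    # Must be set before any CUDA memory is allocated in the process
    # (setting it before importing torch is the safest option).
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

    import torch

That said, the message itself notes this only helps when reserved memory is much larger than allocated memory, which does not really seem to be the case in the numbers above.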
Upvotes: 1
Views: 875
Reputation: 63
In the meantime I figured out the answer to this question myself: as it turns out, my GPU simply ran out of memory.
For some reason, Google Colab reported the error correctly (see above), while my own GPU threw the cryptic CUBLAS_STATUS_NOT_SUPPORTED error instead of telling me directly that it was a memory issue.
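If anyone else runs into this, a quick way to confirm that it is a memory problem (this is just a generic diagnostic sketch, not code from my simulation) is to print the allocator statistics right before the failing line:

    import torch

    def print_gpu_memory(tag, device=0):
        """Print how much GPU memory PyTorch has allocated/reserved vs. the card's total."""
        gib = 1024 ** 3
        total = torch.cuda.get_device_properties(device).total_memory
        allocated = torch.cuda.memory_allocated(device)
        reserved = torch.cuda.memory_reserved(device)
        print(f"[{tag}] allocated {allocated / gib:.2f} GiB | "
              f"reserved {reserved / gib:.2f} GiB | total {total / gib:.2f} GiB")

    # Call this right before the big matmul. For complex64, the result alone needs
    # N * steps * 4**n * 8 bytes; if that does not fit next to what is already
    # allocated, the operation fails -- and on some setups the failure surfaces as
    # CUBLAS_STATUS_NOT_SUPPORTED rather than a plain "CUDA out of memory".
    print_gpu_memory("before matmul")

torch.cuda.memory_summary() gives a more detailed breakdown if these few numbers are not enough.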
Upvotes: 1