Achimnol

Reputation: 1599

cudaSetDevice() allocates more than 580 MB of global memory

I have a sophisticated CUDA-based Linux application. It runs on an i7 machine with one NVIDIA GTX 560 Ti card (1 GB memory), using Ubuntu 12.04 (x86_64) and NVIDIA driver 295.41 + CUDA 4.2 Toolkit.

The application requires about 600-700 MB of global GPU memory, and it fails to run due to an "out of memory" error on calls to cudaMalloc().

After some debugging, I found that the first call to cudaSetDevice() at the very beginning of the application allocates about 580 MB of global memory at once, leaving only 433 MB available for the rest of the application.

The CUDA reference manual says that cudaSetDevice() initializes a "primary context" for the device and allocates various resources such as CUDA kernels (called "modules" in the driver API) and constant variables. The application has some __device__ __constant__ variables, but their total size is just a few KB, and there are about 20-30 kernels and device functions.

I have no idea why CUDA allocates such a large amount of GPU memory during initialization. A separate minimal program that does only cudaSetDevice(0); cudaMemGetInfo(&a, &t); printf("%ld, %ld\n", a, t); reports about 980 MB of available memory. So the problem should reside in my application, but I could not figure out what causes such a large allocation, because the implementation details of cudaSetDevice() are completely proprietary.
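For reference, a standalone version of that minimal test (a sketch based on the snippet above, with error checking added; file and variable names are my own) is:

    // minimal_meminfo.cu -- report free/total GPU memory right after context creation
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        cudaError_t err = cudaSetDevice(0);  // first runtime call; creates the primary context
        if (err != cudaSuccess) {
            fprintf(stderr, "cudaSetDevice: %s\n", cudaGetErrorString(err));
            return 1;
        }
        size_t free_bytes = 0, total_bytes = 0;
        err = cudaMemGetInfo(&free_bytes, &total_bytes);
        if (err != cudaSuccess) {
            fprintf(stderr, "cudaMemGetInfo: %s\n", cudaGetErrorString(err));
            return 1;
        }
        printf("%zu free, %zu total\n", free_bytes, total_bytes);
        return 0;
    }

Built with nvcc, this is the program that reports roughly 980 MB free on the same card, so the extra consumption only appears when the full application creates its context.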

Could anyone suggest other ideas about what might cause this?

Upvotes: 3

Views: 2751

Answers (3)

TripleS

Reputation: 1246

I presume that cudaSetDevice() is the first CUDA call in your application. As a CUDA developer you should know that the first CUDA call is very expensive, because CUDA first allocates its components on the graphics card, which takes around 500 MB.

Try starting your program with a different CUDA call, e.g. cudaMalloc(); you will see the same amount of memory allocated by CUDA. You can also run deviceQuery from the CUDA Samples to see how much memory is in use.
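For example (a sketch; cudaFree(0) is a common way to force context creation without calling cudaSetDevice() explicitly), the overhead shows up regardless of which call comes first:

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        cudaFree(0);  // any first runtime call creates the primary context on device 0
        size_t free_b = 0, total_b = 0;
        cudaMemGetInfo(&free_b, &total_b);
        printf("free: %.1f MB, total: %.1f MB\n",
               free_b / (1024.0 * 1024.0), total_b / (1024.0 * 1024.0));
        return 0;
    }

The gap between total and free right after this call is roughly the context's fixed cost plus whatever else (e.g. the display) is already resident on the card.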

Upvotes: 3

mchen

Reputation: 10136

I had a similar problem: the first call to any cudaXXX() function caused the reported VmData (on UNIX) to spike massively, sometimes to tens of GB. This is not a bug; the reason is explained here:

Why does the Cuda runtime reserve 80 GiB virtual memory upon initialization?
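If you want to observe the spike yourself, a quick check (a sketch; VmData is read from /proc/self/status, which is Linux-specific) is:

    #include <cstdio>
    #include <cstring>
    #include <cuda_runtime.h>

    // Return the VmData value (in kB) from /proc/self/status, or -1 on failure.
    static long vmdata_kb(void) {
        FILE *f = fopen("/proc/self/status", "r");
        if (!f) return -1;
        char line[256];
        long kb = -1;
        while (fgets(line, sizeof line, f)) {
            if (strncmp(line, "VmData:", 7) == 0) {
                sscanf(line + 7, "%ld", &kb);
                break;
            }
        }
        fclose(f);
        return kb;
    }

    int main() {
        printf("VmData before first CUDA call: %ld kB\n", vmdata_kb());
        cudaFree(0);  // first runtime call; triggers the large virtual-memory reservation
        printf("VmData after first CUDA call:  %ld kB\n", vmdata_kb());
        return 0;
    }

Note that this is virtual address space, not physical GPU or host memory, which is why the numbers can look alarming without actually causing allocation failures.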

Upvotes: -1

sjiagc

Reputation: 56

It sounds like an issue; would you like to file a bug with NVIDIA? The steps are:

1. Open the page http://developer.nvidia.com/cuda/join-cuda-registered-developer-program.
2. If not registered, click "Join Now"; otherwise click "Login Now".
3. Enter your e-mail and password to log in.
4. On the left panel, there is a "Bug Report" item in the Home section; click it to file a bug.
5. Fill in the required items; the other items are optional, but detailed information will help us target and fix the issue.
6. If necessary, upload an attachment.
7. For Linux systems, it is better to attach an nvidia-bug-report.
8. If the issue is related to a specific code pattern, sample code and instructions to compile it are desirable for reproduction.

Upvotes: 1
