bweber

Reputation: 4082

Determine maximum amount of GPU device memory that can be allocated contiguously

I am currently working on a CUDA application that should use as much global device memory (VRAM) as is available when the processed data is sufficiently large. I am allocating a 3D volume using cudaMalloc3D, so the memory must be contiguous. To this end I tried retrieving the amount of free device memory with cudaMemGetInfo and then allocating that much. However, this does not work: I still get errors when trying to allocate that amount of memory.
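
For illustration, here is a minimal sketch of what I tried (the element type and slice dimensions are placeholders; the depth is derived so the volume spans essentially all of the reported free memory):

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        size_t freeBytes = 0, totalBytes = 0;
        cudaMemGetInfo(&freeBytes, &totalBytes);

        // Placeholder slice dimensions; depth is chosen so the volume
        // covers roughly all of the reported free memory.
        size_t width = 512, height = 512;
        size_t sliceBytes = width * height * sizeof(float);
        size_t depth = freeBytes / sliceBytes;

        cudaPitchedPtr volume;
        cudaExtent extent = make_cudaExtent(width * sizeof(float), height, depth);
        cudaError_t err = cudaMalloc3D(&volume, extent);

        // Fails with cudaErrorMemoryAllocation even though cudaMemGetInfo
        // reported enough free memory.
        printf("cudaMalloc3D: %s\n", cudaGetErrorString(err));
        return 0;
    }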

Now, my question is whether there is a way to retrieve the maximum amount of device memory that I can allocate contiguously.

One option would be a trial-and-error approach where I iteratively decrease the amount I try to allocate until allocation succeeds. However, I don't like this idea very much.
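
Just to make it concrete, the trial-and-error variant would look something like this (allocateLargestVolume is a hypothetical helper; the dimensions and step size are placeholders):

    #include <cuda_runtime.h>

    // Trial-and-error sketch: shrink the volume depth until cudaMalloc3D
    // succeeds. width, height, maxDepth and depthStep are placeholders.
    bool allocateLargestVolume(size_t width, size_t height, size_t maxDepth,
                               size_t depthStep, cudaPitchedPtr* volume)
    {
        size_t depth = maxDepth;
        while (depth > 0) {
            cudaExtent extent = make_cudaExtent(width * sizeof(float), height, depth);
            if (cudaMalloc3D(volume, extent) == cudaSuccess)
                return true;          // found a depth that fits
            cudaGetLastError();       // clear the allocation error before retrying
            depth = (depth > depthStep) ? depth - depthStep : 0;
        }
        return false;                 // nothing fit
    }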

Background: I have a program that does cone-beam CT reconstruction on the GPU. These volumes can become quite large, so I split them into chunks when necessary. Therefore, I need to know the maximum chunk size that still fits into global device memory.

Upvotes: 5

Views: 1920

Answers (1)

Robert Crovella

Reputation: 151799

Now, my question is whether there is a way to retrieve the maximum amount of device memory that I can allocate contiguously.

There is not.

With a bit of trial and error, you can come up with an estimated maximum, say 80% of the available memory reported by cudaMemGetInfo(), and use that.
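
For example (the 80% factor is just the rough estimate described above; the element type and slice dimensions are illustrative):

    #include <cuda_runtime.h>

    int main() {
        size_t freeBytes = 0, totalBytes = 0;
        cudaMemGetInfo(&freeBytes, &totalBytes);

        // Use ~80% of the reported free memory as the allocation budget.
        size_t budget = static_cast<size_t>(0.8 * freeBytes);

        // Illustrative slice dimensions; pick the largest depth that
        // keeps the whole volume within the budget.
        size_t width = 512, height = 512;
        size_t sliceBytes = width * height * sizeof(float);
        size_t depth = budget / sliceBytes;

        cudaPitchedPtr volume;
        cudaError_t err = cudaMalloc3D(&volume,
            make_cudaExtent(width * sizeof(float), height, depth));
        return err == cudaSuccess ? 0 : 1;
    }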

The situation with cudaMalloc is generally similar to a host-side allocator, e.g. malloc. If you queried the host operating system for the available memory, then tried to allocate all of it in a single malloc call, it would likely fail.

Upvotes: 7
