Alvin

Reputation: 950

Number of blocks/threads created and memory occupied when CUDA cufftExecC2C is invoked

I'm using cuFFT functions in my program on a Tesla K20 card. My signal size is 16384.

How many blocks and threads are created, and how much GPU memory is consumed, when cufftExecC2C is called?

Upvotes: 0

Views: 486

Answers (1)

Robert Crovella

Reputation: 152164

As @harrism indicated, you can use nvprof to discover the execution parameters (the grid and block dimensions of each kernel that cuFFT launches).

nvprof --print-gpu-trace <your-executable>

For the memory, you could use an observational method as well, such as using nvidia-smi to query GPU memory usage while your application is running, or use one of the CUDA API calls like cudaMemGetInfo to query memory while your FFT is running.

CUDA 5.5 introduces a new set of CUFFT API calls to help with estimating memory needs as well. The relevant API calls are:

cufftEstimate1d(…)
cufftEstimate2d(…)
cufftEstimate3d(…)
cufftEstimateMany(…)

These calls return an estimated work area size, in bytes, for the proposed transform type and size.

Refer to the CUDA 5.5 RC documentation, e.g. on a Linux CUDA 5.5 RC install:

/usr/local/cuda/doc/pdf/CUFFT_Library.pdf

In particular, see section 3.4, "CUFFT Estimated Size of Work Area".

If you already have a plan, you can get a more accurate size using the CUDA 5.5 CUFFT cufftGetSize... calls, which are analogous to the estimate calls. Refer to section 3.5 of the aforementioned document for details.

Upvotes: 3
