cplusplusrat

Reputation: 1445

CUDA and Graphics Kernels Order of Execution

I have code that goes something like this:

1) Host: Launch graphics kernels
2) Host: Launch CUDA kernels (all async calls)
3) Host: Do a bunch of number crunching on the host
4) Back to step 1

My question is this. The CUDA API guarantees that CUDA kernels, even if they are async, are executed in the order in which they were launched. Does this apply to the rendering as well? Let's say I have some rendering-related calculations in progress on the GPU. If I launch async CUDA calls, will they only be executed once the rendering is complete? Or will these two operations overlap?

Also, if I call a CUDA device synchronize after step 2, it certainly forces the device to complete the CUDA-related function calls. What about rendering? Does it stall the host until the rendering-related operations are complete as well?

Upvotes: 1

Views: 556

Answers (1)

stuhlo

Reputation: 1507

Calling CUDA kernels somehow locks the GPU, so any other usage of the GPU is not supported. Each host process has to execute device code in a specific context, and only one context can be active on a single device at a time.

Calling cudaDeviceSynchronize() blocks the calling host code. Control returns to the calling host code once all streams of device code have finished executing.
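As a minimal sketch of that blocking behaviour, assuming a CUDA-capable device and the CUDA runtime API: two kernels are launched asynchronously into the default stream, so they run in launch order, and the host then waits on cudaDeviceSynchronize(). The kernels `fill` and `addOne` and the helper `run_demo` are hypothetical names made up for illustration, and the sketch covers CUDA work only, not graphics work.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: write the same value into every element.
__global__ void fill(int *data, int n, int value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = value;
}

// Hypothetical kernel: depends on the result of fill().
__global__ void addOne(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Returns 0 on success, 1 on any failure.
int run_demo() {
    const int n = 1 << 20;
    int *d_data = nullptr;
    if (cudaMalloc(&d_data, n * sizeof(int)) != cudaSuccess) return 1;

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Both launches return to the host immediately. They go into the same
    // (default) stream, so addOne starts only after fill has finished.
    fill<<<blocks, threads>>>(d_data, n, 41);
    addOne<<<blocks, threads>>>(d_data, n);

    // Host-side number crunching could overlap with the kernels here.

    // Block the host until all previously issued CUDA work is done.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(d_data);
        return 1;
    }

    int first = 0;
    err = cudaMemcpy(&first, d_data, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    if (err != cudaSuccess) return 1;

    std::printf("data[0] = %d\n", first);
    return first == 42 ? 0 : 1;
}

int main() { return run_demo(); }
```

Compiled with nvcc, this should report 42 for the first element, since the increment can only run after the fill.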

EDIT: See this very comprehensive but somewhat out-of-date answer, and you can study this paper to see what the latest devices are capable of. In short, launching a CUDA kernel, or even calling cudaSetDevice(), on a device that is being used concurrently by another thread fails with an error. If you would like to share your GPU among concurrent CUDA processes, it is possible (on Linux machines only) to use an intermediate layer (called MPS) between host threads and CUDA API calls. This is described in my second link.
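To see which error you actually get in such a case, you can check the return codes instead of assuming success. This is only a sketch under assumptions: device index 0 is a placeholder, `try_acquire_device` is a hypothetical helper, and cudaFree(0) is used here as a common idiom to force context creation, since cudaSetDevice() alone may not create the context.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: select a device and force context creation,
// returning the first error encountered (cudaSuccess if none).
cudaError_t try_acquire_device(int device) {
    cudaError_t err = cudaSetDevice(device);
    if (err != cudaSuccess) return err;

    // Freeing a null pointer is a no-op, but it makes the runtime
    // create the context, so a busy or unavailable device shows up here.
    return cudaFree(0);
}

int main() {
    cudaError_t err = try_acquire_device(0);  // device 0 is an assumption
    if (err != cudaSuccess) {
        std::printf("Could not use device 0: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("Device 0 is usable from this process\n");
    return 0;
}
```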

Upvotes: 1
