erogol

Reputation: 13614

What happens when multiple kernels are sent to the device to be executed?

Suppose that I send two consecutive kernel calls to the device. Does it wait for the first one to complete, or does it execute them concurrently? If they are executed in parallel, do they interfere with each other, for instance in memory access? What paradigm does CUDA use in such a case?

Upvotes: 1

Views: 862

Answers (1)

harrism

Reputation: 27809

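Two consecutive kernel launches to the same CUDA device can run concurrently only if all of the following hold:

  1. They are launched from the same CUDA context.
  2. They are launched into different CUDA streams. Kernels launched into the same stream execute one after another, in launch order.
  3. The device supports concurrent kernel execution (compute capability 2.0 and later).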
  4. There are sufficient resources (registers, shared memory, thread blocks) to support thread blocks from both kernels simultaneously.
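
The conditions above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the kernel `add_constant` and the helper `launch_in_two_streams` are hypothetical names, and the two launches work on separate buffers so that no synchronization between them is needed. Whether the kernels actually overlap still depends on the device and on free resources.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: adds `delta` to every element.
__global__ void add_constant(float *data, int n, float delta) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += delta;
}

// Launches the kernel twice, on separate buffers and in separate streams.
// Returns true if all CUDA calls succeeded and both results are as expected.
bool launch_in_two_streams(int n) {
    size_t bytes = n * sizeof(float);
    float *a = nullptr, *b = nullptr;
    if (cudaMalloc(&a, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&b, bytes) != cudaSuccess) { cudaFree(a); return false; }

    // Zero both buffers and wait, so the kernels start from known data.
    cudaMemset(a, 0, bytes);
    cudaMemset(b, 0, bytes);
    cudaDeviceSynchronize();

    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    // Different streams: the device is allowed (not required) to overlap these.
    add_constant<<<blocks, threads, 0, s1>>>(a, n, 1.0f);
    add_constant<<<blocks, threads, 0, s2>>>(b, n, 2.0f);

    // Wait for all work on the device to finish before reading results.
    cudaError_t err = cudaDeviceSynchronize();

    float ha = 0.0f, hb = 0.0f;
    cudaMemcpy(&ha, a, sizeof(float), cudaMemcpyDeviceToHost);
    cudaMemcpy(&hb, b, sizeof(float), cudaMemcpyDeviceToHost);

    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(a);
    cudaFree(b);
    return err == cudaSuccess && ha == 1.0f && hb == 2.0f;
}
```

Calling `launch_in_two_streams(4096)` from `main` and compiling with `nvcc` should be enough to try this. Small grids are more likely to overlap, because a large first kernel can occupy the whole device and leave no room for blocks of the second one (condition 4).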

For more information, see this section in the CUDA C Programming Guide.

As sgar91 commented, if these kernels share global memory, then it is the programmer's responsibility to write a correctly synchronized program to avoid race conditions. If the two kernels only read the same memory, then there can be no race condition.
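
One possible way to synchronize two kernels that do share global memory is to order them with a CUDA event. The sketch below assumes a producer/consumer pattern; the kernels `produce` and `consume` and the helper `ordered_across_streams` are hypothetical names chosen for illustration. The consumer's stream waits on an event recorded after the producer, so the consumer never reads the buffer while it is still being written.

```cuda
#include <cuda_runtime.h>

// Hypothetical producer: writes buf[i] = i.
__global__ void produce(float *buf, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] = static_cast<float>(i);
}

// Hypothetical consumer: reads buf and writes out[i] = 2 * buf[i].
__global__ void consume(const float *buf, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2.0f * buf[i];
}

// Returns true if the consumer observed the producer's completed writes.
bool ordered_across_streams(int n) {
    size_t bytes = n * sizeof(float);
    float *buf = nullptr, *out = nullptr;
    if (cudaMalloc(&buf, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&out, bytes) != cudaSuccess) { cudaFree(buf); return false; }

    cudaStream_t s1, s2;
    cudaEvent_t produced;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);
    cudaEventCreateWithFlags(&produced, cudaEventDisableTiming);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    produce<<<blocks, threads, 0, s1>>>(buf, n);
    cudaEventRecord(produced, s1);         // marks the end of the producer in s1
    cudaStreamWaitEvent(s2, produced, 0);  // s2 waits on the device for that point
    consume<<<blocks, threads, 0, s2>>>(buf, out, n);

    cudaError_t err = cudaDeviceSynchronize();

    float last = -1.0f;
    cudaMemcpy(&last, out + (n - 1), sizeof(float), cudaMemcpyDeviceToHost);

    cudaEventDestroy(produced);
    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(buf);
    cudaFree(out);
    return err == cudaSuccess && last == 2.0f * static_cast<float>(n - 1);
}
```

Launching both kernels into the same stream would give the same ordering with less code. The event form is only useful when other, independent work in the second stream should still be free to overlap with the producer.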

Upvotes: 3
