mehdi_bm

Reputation: 409

Synchronising multiple devices in CUDA

In the CUDA manual, the explanation of cudaStreamSynchronize(stream) says:

Blocks until stream has completed all operations. If the cudaDeviceScheduleBlockingSync flag was set for this device, the host thread will block until the stream is finished with all of its tasks.

My question: does this barrier block the host (and with it all the devices in a multi-GPU setup) until all previously issued operations in the stream have finished? Am I right?

And what about cudaDeviceSynchronize() in a multi-GPU task? Does it block the host only until all tasks issued on the device selected by cudaSetDevice(deviceid) have finished, or until all operations previously issued on all devices have finished?

Upvotes: 0

Views: 1125

Answers (1)

mehdi_bm

Reputation: 409

I found the answer to my questions, and I am posting it here for anyone who might face the same problem. I quote it from the CUDA programming guide:

cudaDeviceSynchronize() waits until all preceding commands in all streams of all host threads have completed.
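For the multi-GPU part of the question: as far as I can tell from the runtime API reference, cudaDeviceSynchronize() blocks the host only until the current device (the one selected with cudaSetDevice()) has finished its work, so waiting for several GPUs means visiting each one in turn. Below is a minimal sketch under that assumption; the helper name syncAllDevices is made up for illustration and is not part of CUDA.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper (not a CUDA API): block the host until every visible
// device has finished all work previously issued to it.
cudaError_t syncAllDevices() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) return err;

    int previous = 0;
    err = cudaGetDevice(&previous);      // remember the caller's current device
    if (err != cudaSuccess) return err;

    for (int dev = 0; dev < count; ++dev) {
        err = cudaSetDevice(dev);
        if (err != cudaSuccess) break;
        err = cudaDeviceSynchronize();   // waits for the current device only
        if (err != cudaSuccess) break;
    }
    cudaSetDevice(previous);             // restore the caller's device
    return err;
}
```

The host thread is what blocks in each call; the other devices keep executing their queued work while the loop waits on one of them.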

cudaStreamSynchronize() takes a stream as a parameter and waits until all preceding commands in the given stream have completed. It can be used to synchronize the host with a specific stream, allowing other streams to continue executing on the device.
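To illustrate the difference from a device-wide barrier, here is a small sketch that queues work on two streams and then blocks the host on the first one only. The function name waitOnFirstStreamOnly and the use of cudaMemsetAsync as stand-in work are my own choices for the example.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: queue a memset on each of two streams, then block the
// host on the first stream only.
cudaError_t waitOnFirstStreamOnly(size_t bytes) {
    cudaStream_t s0 = nullptr, s1 = nullptr;
    void *a = nullptr, *b = nullptr;

    cudaError_t err = cudaStreamCreate(&s0);
    if (err == cudaSuccess) err = cudaStreamCreate(&s1);
    if (err == cudaSuccess) err = cudaMalloc(&a, bytes);
    if (err == cudaSuccess) err = cudaMalloc(&b, bytes);
    if (err == cudaSuccess) err = cudaMemsetAsync(a, 0, bytes, s0);
    if (err == cudaSuccess) err = cudaMemsetAsync(b, 0, bytes, s1);

    // The host blocks here until s0 has drained; work in s1 may still be running.
    if (err == cudaSuccess) err = cudaStreamSynchronize(s0);

    // Cleanup. cudaFree(nullptr) is a no-op, so partial failures are safe here.
    cudaFree(a);
    cudaFree(b);
    if (s0) cudaStreamDestroy(s0);
    if (s1) cudaStreamDestroy(s1);
    return err;
}
```

So the answer to the first question is that only the calling host thread waits, and only for that one stream; other streams and other devices are not held up by the call.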

cudaStreamWaitEvent() takes a stream and an event as parameters (see Events for a description of events) and makes all the commands added to the given stream after the call to cudaStreamWaitEvent() delay their execution until the given event has completed.
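This is the call that lets one GPU wait for another without blocking the host, since the programming guide notes that cudaStreamWaitEvent() succeeds even when the stream and the event belong to different devices. A rough sketch follows; orderStreams is a hypothetical name, and it assumes the producer stream belongs to the current device, because cudaEventRecord() requires the event and the stream to be on the same device.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: make work queued on `consumer` after this call wait for
// the work already queued on `producer`, without blocking the host.
// Assumption: `producer` was created on the current device, so the event
// created here can be recorded into it. `consumer` may be on another device.
cudaError_t orderStreams(cudaStream_t producer, cudaStream_t consumer) {
    cudaEvent_t done;
    cudaError_t err = cudaEventCreateWithFlags(&done, cudaEventDisableTiming);
    if (err != cudaSuccess) return err;

    err = cudaEventRecord(done, producer);      // marks "everything so far" in producer
    if (err == cudaSuccess)
        err = cudaStreamWaitEvent(consumer, done, 0);

    // Destroying a recorded event is allowed; its resources are released
    // once the event has completed.
    cudaEventDestroy(done);
    return err;
}
```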

cudaStreamQuery() provides applications with a way to know if all preceding commands in a stream have completed.
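A non-blocking check could look like the sketch below. cudaStreamQuery() returns cudaSuccess when the stream is idle and cudaErrorNotReady while work is still pending; the wrapper streamIsIdle and its status parameter are invented for this example.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: returns true when all work queued in `stream` has
// finished. Never blocks. If `status` is non-null it receives cudaSuccess for
// both "idle" and "still busy", or the real error code if the query failed.
bool streamIsIdle(cudaStream_t stream, cudaError_t *status) {
    cudaError_t err = cudaStreamQuery(stream);
    if (status) *status = (err == cudaErrorNotReady) ? cudaSuccess : err;
    return err == cudaSuccess;
}
```

The host can do other useful work between calls instead of sitting in cudaStreamSynchronize().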

Upvotes: 1
