Jitendra

Reputation: 303

Does the CPU wait for the device to finish its kernel execution?

Does the host wait for the device to finish its execution completely? For example, suppose the program has the following structure:

// cpu code segment

// data transfer from host to device

QUESTION: Will the CPU wait for the device to finish the transfer? If not, is it possible to make it wait? If so, how?

// kernel launch

QUESTION: Will the CPU wait for the device to finish the kernel execution (assuming the kernel takes a notable amount of time, say 5 seconds)? If not, is it possible to make it wait? If so, how?

// data transfer from device to host

// program terminates after printing some information 

Upvotes: 18

Views: 21465

Answers (1)

sgarizvi

Reputation: 16816

The synchronization functions of the CUDA runtime let you achieve what you want.

cudaDeviceSynchronize():

When you call this function, the CPU will wait until the device has completed ALL of its previously issued work, whether it is a memory copy or a kernel execution.
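As a rough illustration, here is a minimal sketch that follows the structure in the question. It relies on two facts about the runtime: a kernel launch returns control to the host right away, while a plain `cudaMemcpy` is a blocking call from the host's point of view. The kernel `addOne`, the helpers `check` and `runPipeline`, and the array size are made up for this example.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: adds 1.0f to every element.
__global__ void addOne(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Runs copy-in, kernel, copy-out and returns the first result element.
float runPipeline()
{
    const int n = 1 << 20;                 // arbitrary size for the sketch
    const size_t bytes = n * sizeof(float);

    // cpu code segment
    float *h = (float *)malloc(bytes);
    if (h == NULL) { fprintf(stderr, "malloc failed\n"); exit(EXIT_FAILURE); }
    for (int i = 0; i < n; ++i) h[i] = 0.0f;

    float *d = NULL;
    check(cudaMalloc((void **)&d, bytes), "cudaMalloc");

    // data transfer from host to device:
    // plain cudaMemcpy is a blocking call for the host.
    check(cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice), "host-to-device copy");

    // kernel launch: returns to the host without waiting for the kernel.
    addOne<<<(n + 255) / 256, 256>>>(d, n);
    check(cudaGetLastError(), "kernel launch");

    // Explicitly block the host until all device work issued so far is done.
    check(cudaDeviceSynchronize(), "cudaDeviceSynchronize");

    // data transfer from device to host
    check(cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost), "device-to-host copy");

    float first = h[0];
    check(cudaFree(d), "cudaFree");
    free(h);
    return first;
}

int main()
{
    // program terminates after printing some information
    printf("h[0] = %f\n", runPipeline());
    return 0;
}
```

In this particular sketch the final blocking `cudaMemcpy` would already wait for the kernel, so the explicit `cudaDeviceSynchronize()` matters most when the host needs to know the kernel has finished before doing something else, such as stopping a CPU timer or checking for kernel errors.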

cudaStreamSynchronize(cudaStream):

This function will block the CPU until the specified CUDA stream has finished its execution. Other CUDA streams will continue their execution asynchronously.
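A minimal sketch of per-stream waiting might look like the following. It assumes two independent pieces of work, each queued on its own stream with `cudaMemcpyAsync` and pinned host memory from `cudaMallocHost`. The kernel `scale` and the helpers `check`, `enqueue` and `runStreams` are invented for this example.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: multiplies every element by factor.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Queue copy-in, kernel and copy-out on one stream without waiting for them.
static void enqueue(float *h, float *d, int n, float factor, cudaStream_t s)
{
    const size_t bytes = n * sizeof(float);
    check(cudaMemcpyAsync(d, h, bytes, cudaMemcpyHostToDevice, s), "async H2D");
    scale<<<(n + 255) / 256, 256, 0, s>>>(d, n, factor);
    check(cudaGetLastError(), "kernel launch");
    check(cudaMemcpyAsync(h, d, bytes, cudaMemcpyDeviceToHost, s), "async D2H");
}

// Runs two streams and reports the first element produced by each.
int runStreams(float *outA, float *outB)
{
    const int n = 1 << 20;                 // arbitrary size for the sketch
    const size_t bytes = n * sizeof(float);

    // Pinned host memory, so the async copies can overlap with host work.
    float *hA = NULL, *hB = NULL, *dA = NULL, *dB = NULL;
    check(cudaMallocHost((void **)&hA, bytes), "cudaMallocHost A");
    check(cudaMallocHost((void **)&hB, bytes), "cudaMallocHost B");
    check(cudaMalloc((void **)&dA, bytes), "cudaMalloc A");
    check(cudaMalloc((void **)&dB, bytes), "cudaMalloc B");
    for (int i = 0; i < n; ++i) { hA[i] = 1.0f; hB[i] = 1.0f; }

    cudaStream_t sA, sB;
    check(cudaStreamCreate(&sA), "cudaStreamCreate A");
    check(cudaStreamCreate(&sB), "cudaStreamCreate B");

    enqueue(hA, dA, n, 2.0f, sA);
    enqueue(hB, dB, n, 3.0f, sB);

    // Wait only for stream A; work queued on stream B may still be running.
    check(cudaStreamSynchronize(sA), "cudaStreamSynchronize A");
    *outA = hA[0];                         // safe to read: stream A is done

    // Now wait for stream B before touching its results.
    check(cudaStreamSynchronize(sB), "cudaStreamSynchronize B");
    *outB = hB[0];

    check(cudaStreamDestroy(sA), "cudaStreamDestroy A");
    check(cudaStreamDestroy(sB), "cudaStreamDestroy B");
    check(cudaFree(dA), "cudaFree A");
    check(cudaFree(dB), "cudaFree B");
    check(cudaFreeHost(hA), "cudaFreeHost A");
    check(cudaFreeHost(hB), "cudaFreeHost B");
    return 0;
}

int main()
{
    float a = 0.0f, b = 0.0f;
    runStreams(&a, &b);
    printf("stream A result: %f, stream B result: %f\n", a, b);
    return 0;
}
```

The point of the per-stream call is that the host can consume the results of stream A as soon as they are ready instead of stalling until every stream on the device has drained, which is what `cudaDeviceSynchronize()` would do.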

Upvotes: 32
