user1118148

Reputation: 193

cuda: need of synchronization for reading device memory variable

I am running an iterative program in CUDA that runs until convergence. As explained in this SO post (Are cuda kernel calls synchronous or asynchronous), CUDA kernel launches are asynchronous from the CPU's point of view.

In my program, one of the kernels checks for convergence and returns a boolean value for the host to read. Do I need to call cudaDeviceSynchronize() before reading the boolean value?

Upvotes: 1

Views: 700

Answers (1)

scatman

Reputation: 14565

It depends on how you are returning the Boolean value to the CPU. Are you using cudaMemcpy? If so, you don't need cudaDeviceSynchronize(): a cudaMemcpy issued after the kernel in the same (default) stream blocks the host until the kernel has finished executing, and only then copies the data from the GPU to the CPU.
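As a minimal sketch of that pattern, assuming the kernel and the copy both use the default stream: the loop below reads the flag back with a plain cudaMemcpy and never calls cudaDeviceSynchronize(). The kernel `step`, the helper `iterate_until_converged`, and the flag `d_converged` are hypothetical names made up for illustration, not taken from the question, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical iteration step (an assumption for illustration): halves every
// element and clears the flag if any element is still above the tolerance.
__global__ void step(float *x, int n, float tol, bool *converged) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        x[i] *= 0.5f;
        // Every thread that writes stores the same value, so the race is benign.
        if (x[i] > tol) *converged = false;
    }
}

// Runs the kernel until the flag stays true or max_iters is reached.
// Returns the number of iterations performed.
int iterate_until_converged(float *d_x, int n, float tol, int max_iters) {
    bool *d_converged = nullptr;
    cudaMalloc(&d_converged, sizeof(bool));

    bool h_converged = false;
    int iters = 0;
    while (!h_converged && iters < max_iters) {
        // Reset the device flag to true before each launch.
        h_converged = true;
        cudaMemcpy(d_converged, &h_converged, sizeof(bool), cudaMemcpyHostToDevice);

        // The launch returns to the host immediately.
        step<<<(n + 255) / 256, 256>>>(d_x, n, tol, d_converged);

        // Blocking copy in the default stream: it is ordered after the kernel
        // and returns only once the flag has arrived on the host.
        cudaMemcpy(&h_converged, d_converged, sizeof(bool), cudaMemcpyDeviceToHost);
        ++iters;
    }

    cudaFree(d_converged);
    return iters;
}

int main() {
    const int n = 1024;
    float h_x[n];
    for (int i = 0; i < n; ++i) h_x[i] = 1.0f;

    float *d_x = nullptr;
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);

    int iters = iterate_until_converged(d_x, n, 1e-3f, 100);
    printf("converged after %d iterations\n", iters);

    cudaFree(d_x);
    return 0;
}
```

If the flag were copied with cudaMemcpyAsync instead, or the kernel ran in a separate non-blocking stream, the host would have to wait explicitly (for example with cudaStreamSynchronize() or cudaDeviceSynchronize()) before reading it.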

Upvotes: 5
