Reputation: 1103
I am confused about CUDA streams. I've learned that cudaStreamSynchronize() waits until all GPU operations issued to a particular stream have completed. And if we launch two kernels into the same stream, they are executed sequentially; that is, the second kernel starts only after the first has completed.
What I want to ask is: if we have a single stream, is it necessary to synchronize it? Doesn't it execute sequentially whether or not we synchronize?
Upvotes: 3
Views: 2211
Reputation: 152164
Yes, CUDA calls issued to the same stream (the default stream or any other stream) are executed sequentially. They are serialized.
You might still issue a synchronize command into that stream for specific cases where you want to wait for GPU activity to finish before executing some CPU code. CPU code issued immediately after a kernel call, or immediately after a cudaMemcpyAsync for example, would normally execute concurrently with the preceding CUDA call.
One specific case might be CUDA error checking. Another might be if you had some CPU/GPU data exchange going on asynchronously in zero-copy pinned memory (where there would be no need to issue an explicit cudaMemcpy... call).
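A rough sketch of both cases, in case it helps: a kernel writes straight into zero-copy (mapped, pinned) host memory, so there is no cudaMemcpy whose completion the host could rely on, and the host has to synchronize the stream before reading. The same synchronize call doubles as an error check. The names fill_doubled and zero_copy_roundtrip are made up for illustration, and the code assumes a 64-bit system with unified addressing, where mapped allocations work without calling cudaSetDeviceFlags(cudaDeviceMapHost) first.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Hypothetical kernel: writes 2 * i into a buffer that lives in mapped host memory.
__global__ void fill_doubled(int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2 * i;
}

// Returns true if every CUDA call succeeded and the host read the expected values.
bool zero_copy_roundtrip(int n) {
    cudaStream_t stream = nullptr;
    int *host_buf = nullptr;  // host-side pointer to the pinned, mapped buffer
    int *dev_view = nullptr;  // device-side pointer to the same memory

    bool ok = cudaStreamCreate(&stream) == cudaSuccess;
    ok = ok && cudaHostAlloc((void **)&host_buf, n * sizeof(int),
                             cudaHostAllocMapped) == cudaSuccess;
    ok = ok && cudaHostGetDevicePointer((void **)&dev_view, host_buf, 0) == cudaSuccess;

    if (ok) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        fill_doubled<<<blocks, threads, 0, stream>>>(dev_view, n);

        // Launch-time errors (bad configuration etc.) show up here.
        cudaError_t launch_err = cudaGetLastError();
        // Blocks the CPU until the stream is idle; also reports execution errors.
        cudaError_t exec_err = cudaStreamSynchronize(stream);
        ok = (launch_err == cudaSuccess) && (exec_err == cudaSuccess);

        // Only now is it safe for the host to read what the kernel wrote.
        for (int i = 0; ok && i < n; ++i) ok = (host_buf[i] == 2 * i);
    }

    if (host_buf) cudaFreeHost(host_buf);
    if (stream) cudaStreamDestroy(stream);
    return ok;
}
```

Calling zero_copy_roundtrip(1024) from main should return true on a working device. Without the cudaStreamSynchronize call, the host loop could run while the kernel is still writing.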
But when issued to the same stream, there is usually no need to explicitly synchronize CUDA calls that follow the usual cudaMemcpyAsync...kernel call...cudaMemcpyAsync pattern. The stream will do that for you.
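As a minimal sketch of that pattern (not the only way to write it): a copy in, two kernels, and a copy out are all issued to one stream with no synchronization between them, and the stream orders them. The single cudaStreamSynchronize at the end is there only because the CPU wants to read the result. The kernels add_one and times_two and the function single_stream_pipeline are hypothetical names chosen for this example.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Hypothetical kernels; the second one depends on the first having finished.
__global__ void add_one(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

__global__ void times_two(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Returns true if all calls succeeded and every element equals 2 * (i + 1).
bool single_stream_pipeline(int n) {
    size_t bytes = n * sizeof(float);
    cudaStream_t stream = nullptr;
    float *h = nullptr;  // pinned host buffer, so the async copies can really overlap
    float *d = nullptr;  // device buffer

    bool ok = cudaStreamCreate(&stream) == cudaSuccess;
    ok = ok && cudaMallocHost((void **)&h, bytes) == cudaSuccess;
    ok = ok && cudaMalloc((void **)&d, bytes) == cudaSuccess;

    if (ok) {
        for (int i = 0; i < n; ++i) h[i] = (float)i;

        int threads = 256;
        int blocks = (n + threads - 1) / threads;

        // Everything below goes into the same stream, so it runs in issue order.
        // No synchronize is needed between these four operations.
        cudaMemcpyAsync(d, h, bytes, cudaMemcpyHostToDevice, stream);
        add_one<<<blocks, threads, 0, stream>>>(d, n);
        times_two<<<blocks, threads, 0, stream>>>(d, n);
        cudaMemcpyAsync(h, d, bytes, cudaMemcpyDeviceToHost, stream);

        // The CPU is about to read h, so wait for the stream to drain first.
        ok = cudaStreamSynchronize(stream) == cudaSuccess;

        for (int i = 0; ok && i < n; ++i) ok = (h[i] == 2.0f * ((float)i + 1.0f));
    }

    if (d) cudaFree(d);
    if (h) cudaFreeHost(h);
    if (stream) cudaStreamDestroy(stream);
    return ok;
}
```

Calling single_stream_pipeline(1 << 16) from main should return true. If times_two could start before add_one had finished, the check would fail; it passes because operations within one stream are serialized.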
Upvotes: 6