Reputation: 15110
Simplified problem
I have two host threads, each with its own command queue to the same GPU device. Both queues are out-of-order with the execution order explicitly managed using wait events (simplified example doesn't need this, but actual application does).
ThreadA is a lightweight processing pipeline that runs in real-time as new data is acquired. ThreadB is a heavyweight slower processing pipeline that uses the same input data but processes it asynchronously at a slower rate. I'm using a double buffer to keep the pipelines separate but allow ThreadB to work on the same input data written to device by ThreadA.
ThreadA's loop:
2. Write the input data to cl_mem BufferA using clEnqueueWriteBuffer(CommandQueueA)
3. Run KernelA using clEnqueueNDRangeKernel(CommandQueueA) once write is complete (kernel outputs results to cl_mem OutputA)
4. Read OutputA using clEnqueueReadBuffer(CommandQueueA)
ThreadB's loop:
2. Copy BufferA to BufferB using clEnqueueCopyBuffer(CommandQueueB) (double buffer swap)
3. Run KernelB using clEnqueueNDRangeKernel(CommandQueueB) once copy is complete (kernel outputs results to cl_mem OutputB)
4. Read OutputB using clEnqueueReadBuffer(CommandQueueB)
My Questions
There's a potential race condition between ThreadA's step 2 and ThreadB's step 2. I don't care which is executed first, I just want to make sure I don't copy BufferA to BufferB while BufferA is being written to.
1. Does OpenCL provide any implicit guarantees that this won't happen?
2. If not, if I instead on ThreadB step 2 use clEnqueueCopyBuffer(CommandQueueA) so that both the write and copy operations are in the same command queue, does that guarantee that they can't run simultaneously even though the queue allows out-of-order execution?
3. If not, is there a better solution than adding the WriteBuffer's event in ThreadA to the waitlist of the CopyBuffer command in ThreadB?
It seems like any of these should work, but I can't find where in the OpenCL spec it says this is fine. Please cite the OpenCL spec in your answers if possible.
Upvotes: 0
Views: 232
Reputation: 699
Does OpenCL provide any implicit guarantees that this won't happen?
No, there is no implicit synchronization unless you use a single in-order command queue.
If not, if I instead on ThreadB step 2 use clEnqueueCopyBuffer(CommandQueueA) so that both the write and copy operations are in the same command queue, does that guarantee that they can't run simultaneously even though the queue allows out-of-order execution?
No. Regardless of the queue's type (in-order vs out-of-order), the OpenCL runtime does not track memory dependencies between commands. The user is responsible for specifying events in a wait list whenever one command depends on another.
The following quote from the spec supports that:
Section 3.2.1, Execution Model: Context and Command Queues
Out-of-order Execution: Commands are issued in order, but do not wait to complete before following commands execute. Any order constraints are enforced by the programmer through explicit synchronization commands.
It is not a direct answer to your question, but I assume that if any such guarantees were provided, they would be mentioned in this section.
If not, is there a better solution than adding the WriteBuffer's event in ThreadA to the waitlist of the CopyBuffer command in ThreadB?
If you can use a single in-order queue, that would probably be more efficient than a cross-queue event, at least for some implementations.
Upvotes: 2