Reputation: 31
Is there any way to fire an event from inside a device kernel in CUDA, for benchmarking purposes, similar to the way cudaEvents are recorded from CPU code?
For example, suppose I would like to measure the time from kernel start until the very first thread begins its computation, and the time from the last thread leaving the computation until control returns to the CPU.
Can I do that?
Upvotes: 2
Views: 775
Reputation: 132220
An ugly workaround would be to write to some managed-memory location from the kernel and have a host-side thread poll it, firing the event when the value changes.
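A rough sketch of that polling workaround, under several assumptions: the names `Marks`, `Timings`, `work` and `measure_kernel_phases` are made up for illustration, the device must report `cudaDevAttrConcurrentManagedAccess` (otherwise the host may not touch managed memory while a kernel is running), and instead of recording a CUDA event the host simply takes `std::chrono` timestamps when it sees the values change. The numbers are what the host observes, so they include polling and page-migration latency.

```cuda
#include <chrono>
#include <cuda_runtime.h>

// All names here are hypothetical, chosen only for this sketch.
struct Marks {
    int started;            // set to 1 by whichever thread gets there first
    unsigned int finished;  // number of threads that have left the computation
};

struct Timings {
    long long launch_to_first_us;  // launch call -> host sees first thread
    long long last_to_return_us;   // host sees last thread -> sync returns
};

__global__ void work(Marks *m, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    atomicExch(&m->started, 1);           // "a thread has started"
    if (i < n) out[i] = sqrtf((float)i);  // stand-in for the real computation
    __threadfence_system();               // order the result before the count
    atomicAdd(&m->finished, 1u);          // "one more thread has left"
}

// Returns false if there is no usable device or the kernel failed.
bool measure_kernel_phases(Timings *t)
{
    using clk = std::chrono::steady_clock;
    using us = std::chrono::microseconds;

    int dev = 0, concurrent = 0;
    if (cudaGetDevice(&dev) != cudaSuccess) return false;
    cudaDeviceGetAttribute(&concurrent, cudaDevAttrConcurrentManagedAccess, dev);
    if (!concurrent) return false;  // host polling during the kernel is not allowed

    const int n = 1 << 20, threads = 256, blocks = n / threads;
    Marks *m = nullptr;
    float *out = nullptr;
    if (cudaMallocManaged(&m, sizeof(Marks)) != cudaSuccess) return false;
    if (cudaMalloc(&out, n * sizeof(float)) != cudaSuccess) {
        cudaFree(m);
        return false;
    }
    m->started = 0;
    m->finished = 0;

    auto t_launch = clk::now();
    work<<<blocks, threads>>>(m, out, n);  // asynchronous launch

    // Poll the managed flags; stop early if the stream is no longer busy
    // (kernel already finished, or the launch failed).
    volatile Marks *vm = m;
    while (vm->started == 0 && cudaStreamQuery(0) == cudaErrorNotReady) {}
    auto t_first = clk::now();
    while (vm->finished < (unsigned)n && cudaStreamQuery(0) == cudaErrorNotReady) {}
    auto t_last = clk::now();

    bool ok = (cudaDeviceSynchronize() == cudaSuccess);
    auto t_return = clk::now();

    t->launch_to_first_us = std::chrono::duration_cast<us>(t_first - t_launch).count();
    t->last_to_return_us = std::chrono::duration_cast<us>(t_return - t_last).count();

    cudaFree(out);
    cudaFree(m);
    return ok;
}
```

Calling `measure_kernel_phases(&t)` from `main` and printing the two fields would give a coarse picture of both intervals. On systems without concurrent managed access, mapped pinned host memory (`cudaHostAlloc` with `cudaHostAllocMapped`) would probably be the closer fit, but that is a different technique from the one described above.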
Upvotes: 1
Reputation: 72353
The device runtime API (used with dynamic parallelism) does have limited stream and event support, but event timing is not supported.
So no, you can't do that.
Upvotes: 2