Zk1001

Reputation: 2053

Number of active warps in GPU (Fermi)

I have a quick question about active warps on a GPU (I am mainly interested in Fermi). For a given kernel, is the number of active warps on an SM the same at every cycle for the whole execution time of the kernel? In my experiments, there is some correlation between the total number of active warps (over the whole execution) and the number of synchronizations in the kernel. Can anyone clarify this relationship? Thanks.

Upvotes: 0

Views: 811

Answers (2)

veda

Reputation: 6584

The relationship between barrier synchronization and warps is explained in the paper "Demystifying GPU Microarchitecture through Microbenchmarking".

Upvotes: 0

Tom

Reputation: 21108

The number of active warps can vary over time since:

  • Other threadblocks can begin or complete on the same SM. For example, with four warps per threadblock, a single resident threadblock gives up to four active warps, while two or three resident threadblocks give up to eight or twelve respectively.
  • Once a warp reaches the end of its code, it stops executing and is no longer active.
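The arithmetic in the first bullet can be sketched as a small Python model. This is only an illustration, not a profiler or CUDA API: the function name `max_active_warps` is hypothetical, and the cap of 48 resident warps per SM is an assumption for Fermi (compute capability 2.x) that you should check against the documentation for your device.

```python
# Toy model (not a CUDA or profiler API): upper bound on active warps on one SM.
# Assumption: a Fermi-class SM holds at most 48 resident warps.
FERMI_MAX_WARPS_PER_SM = 48


def max_active_warps(warps_per_block, resident_blocks,
                     sm_limit=FERMI_MAX_WARPS_PER_SM):
    """Return the most warps that can be active given the resident threadblocks."""
    return min(warps_per_block * resident_blocks, sm_limit)


# Four warps per threadblock, with one, two, or three resident threadblocks.
for blocks in (1, 2, 3):
    print(blocks, max_active_warps(4, blocks))
```

As threadblocks start and finish on the SM, `resident_blocks` changes, so this upper bound changes over the lifetime of the kernel.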

The active warps count for a whole program execution depends on a number of factors, but remember that the counter is incremented by the number of active warps on each cycle. If you increase the number of syncs, each warp needs more cycles to execute the kernel, so it is counted on more cycles and you would expect a higher active warps count.
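A rough way to see why this happens is to model the counter as a per-cycle sum. The sketch below is a simplified model, not how the hardware counter is actually implemented: `active_warps_counter` is a hypothetical helper, and the cycle counts (100 cycles per warp, 20 extra cycles from syncs) are made-up numbers chosen only for illustration.

```python
def active_warps_counter(cycles_per_warp):
    """Add up, for every cycle, how many warps are still executing.

    cycles_per_warp[i] is the (hypothetical) number of cycles warp i stays active.
    """
    total = 0
    for cycle in range(max(cycles_per_warp, default=0)):
        # A warp contributes 1 on each cycle before it finishes.
        total += sum(1 for c in cycles_per_warp if c > cycle)
    return total


# Four warps that each need 100 cycles (illustrative numbers only).
baseline = [100, 100, 100, 100]
# Assume extra syncs make each warp take 20 more cycles.
with_syncs = [c + 20 for c in baseline]

print(active_warps_counter(baseline))    # 400
print(active_warps_counter(with_syncs))  # 480
```

In this model the counter equals the sum of the per-warp lifetimes, so anything that keeps warps resident for longer, including waiting at barriers, raises the total even though the number of warps is unchanged.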

Also note that some derived statistics in the profiler are approximate, since they often combine values from more than one run, so there can be some variability.

Upvotes: 3
