Reputation: 157
Let's say an SM has been populated with 8 blocks of 64 threads each.
That gives us 2 warps per block and 16 warps in total. SMs can switch between warps in order to hide latency. Must these warps belong to the same block, or can a warp from block 5 be replaced by a warp from block 8, for example?
Upvotes: 0
Views: 179
Reputation: 151809
Yes, the SM scheduler can "alternate" or choose warps for scheduling from any that are resident on that SM.
SMs have a maximum resident warp count (64, currently, for some GPUs) and thread count (2048, currently, for some GPUs) that exceed what a single block can supply (1024 threads, i.e. 32 warps, currently, for all GPUs supported by recent CUDA toolkits). The reason is so that the SM can hold warps from several blocks at once and choose among them for scheduling, which improves the possibilities for latency hiding.
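To make the arithmetic concrete, here is a minimal host-side sketch in plain C++ (no GPU needed). The constants are assumptions copied from the figures quoted above plus a warp size of 32; real limits vary by GPU and should be queried from the device. The helper names (`warps_per_block`, `resident_warps`, `fits_on_sm`) are made up for illustration, and other residency limits such as registers, shared memory, and the per-SM block count are ignored.

```cpp
#include <cassert>

// Assumed limits, taken from the numbers quoted in this answer.
// They are NOT universal; query the actual device for real values.
constexpr int kWarpSize           = 32;    // threads per warp
constexpr int kMaxWarpsPerSM      = 64;    // "for some GPUs"
constexpr int kMaxThreadsPerSM    = 2048;  // "for some GPUs"
constexpr int kMaxThreadsPerBlock = 1024;

// Warps occupied by one block; a partially filled warp still takes a slot.
constexpr int warps_per_block(int threads_per_block) {
    return (threads_per_block + kWarpSize - 1) / kWarpSize;
}

// Total warps resident on the SM for a given number of identical blocks.
constexpr int resident_warps(int blocks, int threads_per_block) {
    return blocks * warps_per_block(threads_per_block);
}

// Hypothetical check: does this configuration stay within the assumed
// thread and warp limits? (Registers, shared memory, and the maximum
// number of blocks per SM are deliberately left out.)
bool fits_on_sm(int blocks, int threads_per_block) {
    assert(blocks > 0 && threads_per_block > 0);
    return threads_per_block <= kMaxThreadsPerBlock
        && blocks * threads_per_block <= kMaxThreadsPerSM
        && resident_warps(blocks, threads_per_block) <= kMaxWarpsPerSM;
}

// The configuration from the question: 8 blocks of 64 threads.
static_assert(warps_per_block(64) == 2, "2 warps per block");
static_assert(resident_warps(8, 64) == 16, "16 warps in total");

// Even the largest possible block supplies only 32 warps, so filling all
// 64 warp slots requires warps from more than one block.
static_assert(warps_per_block(kMaxThreadsPerBlock) == 32, "32 warps max per block");
static_assert(warps_per_block(kMaxThreadsPerBlock) < kMaxWarpsPerSM,
              "a single block cannot fill the SM");
```

Under these assumptions, the 16 resident warps in the question all sit in the same pool that the scheduler picks from, regardless of which of the 8 blocks each warp belongs to.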
Upvotes: 3