sooon
sooon

Reputation: 4888

Metal - Threads and ThreadGroups

I am learning Metal right now and trying to understand the lines below:

let threadGroupCount = MTLSizeMake(8, 8, 1) ///line 1
let threadGroups = MTLSizeMake(drawable.texture.width / threadGroupCount.width, drawable.texture.height / threadGroupCount.height, 1) ///line 2

command_encoder.dispatchThreadgroups(threadGroups, threadsPerThreadgroup: threadGroupCount) ///line 3
  1. for the line 1, What is the 3 integers represent? My guess is to assign the number of threads to be used in the process but which is which?

  2. What is the different between line 1 and 'line 2'? My guess again is the different between threads and thread groups. But I am not sure what is the fundamental difference and when to use what.

Upvotes: 4

Views: 2107

Answers (2)

warrenm
warrenm

Reputation: 31782

When dispatching a grid of work items to a compute kernel, it is your responsibility to divide up the grid into subsets called threadgroups, each of which has a total number of threads (width * height * depth) that is less than the maxTotalThreadsPerThreadgroup of the corresponding compute pipeline state.

The threadsPerThreadgroup size indicates the "shape" of each subset of the grid (i.e. the number of threads in each grid dimension). The threadgroupsPerGrid parameter indicates how many threadgroups make up the entire grid. As in your code, it is often the dimensions of a texture divided by the dimensions of your threadgroup size you've chosen.

One performance note: each compute pipeline state has a threadExecutionWidth value that indicates how many threads of a threadgroup will be scheduled and executed together by the GPU. The optimal threadgroup size will thus always be a multiple of threadExecutionWidth. During development, it's perfectly acceptable to just dispatch a small square grid as you're currently doing.

Upvotes: 12

gpu3d
gpu3d

Reputation: 1184

The first line gives you the number of threads per group (in this case two-dimensional 8x8), while the second line gives you the number of groups per grid. Then the dispatchThreadgroups(_:threadsPerThreadgroup:) function on the third line uses these two numbers. The number of groups can be omitted in which case it defaults to using one group.

Upvotes: 1

Related Questions