Deepak Sharma

Reputation: 6585

Metal multiple compute shaders vs single

I am running multiple (4 or 5) compute shaders that process the same data and produce different outputs. The user may, however, enable one, some, or all of them. From a performance standpoint I have two choices:

  1. Merge all those compute shaders into one and calculate everything in a single pass, then selectively display the data based on user input. This needs only a single pass, but the number of parameters to the compute shader might increase (up to 8 MTLBuffers).

  2. Split them into multiple shaders and use multiple passes, one per piece of data. Each pass uses a different compute command encoder.

Are multiple passes bad for performance when the data already resides on the GPU? Which option is recommended from a performance standpoint?

Upvotes: 0

Views: 989

Answers (1)

Devin Lane

Reputation: 1004

I would expect option 2 to perform just as well, unless there is significant overlap in the calculations performed by each shader (i.e., shared temporaries). The overhead of the command buffers is pretty negligible.
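To make option 2 concrete, here is a minimal sketch that encodes one compute pass per enabled shader into a single command buffer, all reading the same input buffer. The kernel names (`doubleValues`, `squareValues`), the helper `runEnabledPasses`, and the buffer indices are hypothetical stand-ins for the real 4 or 5 shaders, not anything from the question.

```swift
import Metal

// Hypothetical kernels standing in for the real shaders; both read the same input.
let kernelSource = """
#include <metal_stdlib>
using namespace metal;

kernel void doubleValues(device const float *input  [[buffer(0)]],
                         device float       *output [[buffer(1)]],
                         constant uint      &count  [[buffer(2)]],
                         uint id [[thread_position_in_grid]]) {
    if (id >= count) return;              // guard the padded tail of the grid
    output[id] = input[id] * 2.0f;
}

kernel void squareValues(device const float *input  [[buffer(0)]],
                         device float       *output [[buffer(1)]],
                         constant uint      &count  [[buffer(2)]],
                         uint id [[thread_position_in_grid]]) {
    if (id >= count) return;
    output[id] = input[id] * input[id];
}
"""

/// Encodes one compute pass per name in `enabled` (one encoder each) into a
/// single command buffer. Returns nil if Metal or a resource is unavailable.
func runEnabledPasses(input: [Float], enabled: [String]) throws -> [String: [Float]]? {
    let byteLength = input.count * MemoryLayout<Float>.stride
    guard !input.isEmpty,
          let device = MTLCreateSystemDefaultDevice(),
          let queue = device.makeCommandQueue(),
          let commandBuffer = queue.makeCommandBuffer(),
          let inputBuffer = device.makeBuffer(bytes: input, length: byteLength,
                                              options: .storageModeShared)
    else { return nil }

    let library = try device.makeLibrary(source: kernelSource, options: nil)
    var count = UInt32(input.count)
    var outputBuffers: [String: MTLBuffer] = [:]

    // Validate and build everything first, so no encoder is left open on failure.
    var passes: [(name: String, pipeline: MTLComputePipelineState, output: MTLBuffer)] = []
    for name in enabled {
        guard let function = library.makeFunction(name: name),
              let output = device.makeBuffer(length: byteLength, options: .storageModeShared)
        else { return nil }
        passes.append((name, try device.makeComputePipelineState(function: function), output))
    }

    for pass in passes {
        guard let encoder = commandBuffer.makeComputeCommandEncoder() else { return nil }
        encoder.setComputePipelineState(pass.pipeline)
        encoder.setBuffer(inputBuffer, offset: 0, index: 0)   // shared input, never re-uploaded
        encoder.setBuffer(pass.output, offset: 0, index: 1)
        encoder.setBytes(&count, length: MemoryLayout<UInt32>.stride, index: 2)

        let width = min(pass.pipeline.maxTotalThreadsPerThreadgroup, 256)
        let groups = (input.count + width - 1) / width        // round up to cover all elements
        encoder.dispatchThreadgroups(MTLSize(width: groups, height: 1, depth: 1),
                                     threadsPerThreadgroup: MTLSize(width: width, height: 1, depth: 1))
        encoder.endEncoding()
        outputBuffers[pass.name] = pass.output
    }

    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()

    // Copy each shared-storage output back into a Swift array.
    return outputBuffers.mapValues { buffer in
        let pointer = buffer.contents().bindMemory(to: Float.self, capacity: input.count)
        return Array(UnsafeBufferPointer(start: pointer, count: input.count))
    }
}
```

Calling `try runEnabledPasses(input: data, enabled: ["squareValues"])` encodes only that one pass, so disabled shaders cost nothing. In a real app the pipeline states would be built once and cached rather than rebuilt on every call.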

You can profile this using Instruments and the "Metal System Trace" template. It'll tell you how long each kernel spends executing and how long the gaps between them are (where memory copies, command buffer queuing, etc. happen). If the profile for option 2 shows a ton of gaps where the GPU is not being used, then I'm wrong and maybe you need to do fewer passes :)
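As a rough in-code cross-check alongside Instruments, the GPU time of a whole command buffer can be read from its `gpuStartTime` and `gpuEndTime` properties once it has completed, on OS versions that expose them. The helper name `gpuMilliseconds(for:)` below is hypothetical, and this only gives one total per command buffer; it does not show the gaps between kernels the way Metal System Trace does.

```swift
import Metal

/// Commits an already-encoded command buffer, blocks until the GPU finishes,
/// and returns the GPU execution time in milliseconds.
func gpuMilliseconds(for commandBuffer: MTLCommandBuffer) -> Double {
    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()
    // Both properties are in seconds and are only meaningful after completion.
    return (commandBuffer.gpuEndTime - commandBuffer.gpuStartTime) * 1000
}
```

Running this once on a command buffer holding the merged kernel and once on one holding the separate passes, averaged over many frames, gives a first-order comparison of the two options.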

Upvotes: 1
