user10986393

Vulkan: Why have multiple command buffers per pool?

I am developing in Vulkan 1.0, building a rendering system by learning and implementing functionality one step at a time. I get the gist of command recording and submission, but I haven't gotten far enough to understand a use case in which I'd want multiple command buffers per pool. Slide 14 of this presentation raised some questions.

My understanding and current design is as follows:

From the assumptions above, in what cases would it be necessary or advantageous to have more than one command buffer per command pool as opposed to having one command buffer that records everything from beginning to end for the given frame and thread?

Upvotes: 2

Views: 2984

Answers (1)

Nicol Bolas

Reputation: 474336

having one command buffer that records everything from beginning to end for the given frame and thread?

Well, what happens if a thread needs to record things in an order other than the order in which they need to be submitted? That's kind of the point of a CB, isn't it? The ability to build commands in an order that is convenient, then submit them in the way that works out for the GPU.

For example, let's say you have a thread that is rendering a particular set of objects. To do that, you need to write their matrices and other per-object properties to a uniform buffer. And let's say that, for whatever reason, this particular Vulkan implementation doesn't allow you to use mappable memory directly for uniform buffers. So you have to write to mappable memory and copy the data to a uniform buffer via a memory transfer operation.

So the thread creating the commands for these meshes needs to do two things. It needs to build the commands to render the meshes, and it needs to build the commands to transfer the uniform data to the buffer that the rendering commands will read from.

Your way, however, requires that commands be put into the CB in the order you want them executed. So you would have to loop through the entire list of objects to build the transfer commands, then loop through it again to build the rendering commands. But you're reading the same objects each time through the loop. During the first loop, you already had access to 100% of the data needed to issue the rendering command.

And by the second time through the loop, that data is likely no longer in the cache. So the second pass incurs about the same number of cache misses (and therefore real memory accesses) as the first.

That's bad.
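
To make the alternative concrete, here is a minimal sketch of visiting each object once while recording into two separate buffers, then choosing the submission order afterwards. It deliberately does not use the Vulkan API: `CommandBuffer` is a toy stand-in (a list of strings), and `Object`, `Recorded`, `recordOnePass` and `submitInOrder` are hypothetical names invented for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a command buffer: an ordered list of command names.
// This is NOT the Vulkan API; it only models recording order vs. submission order.
using CommandBuffer = std::vector<std::string>;

// Hypothetical per-object record; real code would hold matrices and so on.
struct Object {
    std::string name;
};

struct Recorded {
    CommandBuffer transfer;  // copies from mappable memory into the uniform buffer
    CommandBuffer render;    // draw commands that read the uniform buffer
};

// Visit each object exactly once and record both kinds of command
// while that object's data is still hot in the cache.
Recorded recordOnePass(const std::vector<Object>& objects) {
    Recorded out;
    for (const Object& obj : objects) {
        out.transfer.push_back("copy uniforms: " + obj.name);
        out.render.push_back("draw: " + obj.name);
    }
    return out;
}

// Submission order is decided after recording: all transfers, then all draws.
std::vector<std::string> submitInOrder(const Recorded& r) {
    std::vector<std::string> queue(r.transfer.begin(), r.transfer.end());
    queue.insert(queue.end(), r.render.begin(), r.render.end());
    return queue;
}
```

Calling `submitInOrder(recordOnePass(objects))` yields every copy before any draw, even though the copies and draws were recorded interleaved in a single loop. With one buffer, getting that order would take two passes over the objects.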

Furthermore, rendering commands need to be placed within a render pass instance. Transfer commands cannot be in a render pass instance. But if you're putting transfer commands into the same CB as the rendering commands... that CB must begin and end the render pass instance.

So... how can other threads issue commands for that render pass instance?

If you want parallelism (and you do), then you need these threads to be creating secondary CBs for their rendering commands. A later task will collate them into the primary CB, and that CB will have the render pass instance. But secondary CBs built for a render pass cannot contain transfer commands.

So if you want parallelism, then any transfer commands that have to be generated alongside rendering commands must go into a different CB. One that will be submitted before the secondary CBs (or even submitted to a different queue altogether).
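
A rough model of that arrangement, again with toy types rather than real Vulkan calls: each worker fills its own transfer buffer and its own secondary buffer, and a later collation step places all transfers ahead of the render pass and all secondaries inside it. `WorkerOutput`, `recordSlice` and `buildFrame` are hypothetical names. The workers run in a plain loop here to keep the sketch dependency-free; since each call writes only to its own `WorkerOutput`, each could just as well run on its own thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a command buffer; NOT the Vulkan API.
using CommandBuffer = std::vector<std::string>;

struct WorkerOutput {
    CommandBuffer transfer;   // must execute outside any render pass instance
    CommandBuffer secondary;  // draws, executed inside the render pass
};

// Record one worker's slice [begin, end). It writes only to `out`,
// so separate workers share no mutable state and need no locking.
void recordSlice(const std::vector<std::string>& objects,
                 std::size_t begin, std::size_t end, WorkerOutput& out) {
    for (std::size_t i = begin; i < end; ++i) {
        out.transfer.push_back("copy " + objects[i]);
        out.secondary.push_back("draw " + objects[i]);
    }
}

// Collation step: transfers first, then one render pass instance
// wrapping every worker's secondary buffer.
std::vector<std::string> buildFrame(const std::vector<std::string>& objects,
                                    std::size_t workerCount) {
    std::vector<WorkerOutput> outputs(workerCount);
    const std::size_t n = objects.size();
    for (std::size_t w = 0; w < workerCount; ++w) {
        // In a real renderer each iteration would be a task on its own thread.
        recordSlice(objects, n * w / workerCount, n * (w + 1) / workerCount,
                    outputs[w]);
    }

    std::vector<std::string> frame;
    for (const WorkerOutput& o : outputs)
        frame.insert(frame.end(), o.transfer.begin(), o.transfer.end());
    frame.push_back("begin render pass");
    for (const WorkerOutput& o : outputs)
        frame.insert(frame.end(), o.secondary.begin(), o.secondary.end());
    frame.push_back("end render pass");
    return frame;
}
```

The point of the model is that no worker ever begins or ends the render pass itself. Only the collation step does, which is why the transfer commands cannot live in the same buffer as the draws.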

Upvotes: 6
