Reputation: 410
I'm facing a problem where the use of an occlusion query in combination with instanced rendering would be desirable.
As far as I understood, something like
glBeginQuery(GL_ANY_SAMPLES_PASSED, occlusionQuery);
glDrawArraysInstanced(mode, i, j, countInstances);
glEndQuery(GL_ANY_SAMPLES_PASSED);
will only tell me, if any of the instances were drawn. What I would need to know is, what set of instances has been drawn (giving me the IDs of all visible instances). Drawing each instance in an own call is no option for me.
An alternative would be to color-code the instances and detect the visible instances manually. But is there really no way to solve this problem with a query command and why would it not be possible?
Upvotes: 0
Views: 1263
Reputation: 473174
It's not possible for several reasons.
Query objects only contain a single counter value. What you want would require a separate sample passed count for each instance.
Even if query objects stored arrays of sample counts, you can issue more than one draw call in the begin/end scope of a query. So how would OpenGL know which part of which draw call belonged to which query value in the array? You can even change other state within the query scope; uniform bindings, programs, pretty much anything.
The samples-passed count is determined entirely by the rasterizer hardware on the GPU. And the rasterizer neither knows nor cares which instance generated a triangle.
Instancing is a function of the vertex processing and/or vertex specification stages; by the time the rasterizer sees it, that information is gone. Notice that fragment shaders don't even get an instance ID as input, unless you explicitly create one by passing it from your vertex processing stage(s).
However, if you truly want to do this you could use image load/store and its atomic operations. That is, pass the fragment shader the instance in question (as an int
data type, with flat
interpolation). This FS also uses a uimageBuffer
buffer texture, which uses the GL_R32UI
format (or you can use an SSBO unbounded array). It then performs an imageAtomicAdd
, using the instance value passed in as the index to the buffer. Oh, and you'll need to have the FS explicitly require early tests, so that samples which fail the fragment tests will not execute.
Then use a compute shader to build up a list of rendering commands for the instances which have non-zero values in the array. Then use an indirect rendering call to draw the results of this computation. Now obviously, you will need to properly synchronize access between these various operations. So you'll need to use appropriate glMemoryBarrier
calls between each one.
Even if queries worked the way you want them to, this would be overall far more preferable than using a query object. Unless you're reading a query into a buffer object, reading a query object requires a GPU/CPU synchronization of some form. Whereas the above requires some synchronization and barrier operations, but they're all on-GPU operations, rather than synchronizing with the CPU.
Upvotes: 5