Reputation: 1
I recently wanted to work on a compute shader for OpenGL. In this experiment, I wanted to access one of the color textures attached to a Framebuffer Object (FBO). When I attempted to pass the texture to the compute shader as a `layout(rgba32f) readonly image2D`, nothing was passed in. I rewrote the compute shader to use a `sampler2D` instead, and the sampler worked just fine.
I also tested the `gimage2D` compute shader with a standalone texture that wasn't attached to anything. This also worked as expected.
I haven't found any documentation stating that a texture attached to an FBO can't be accessed in a compute shader using `gimage2D`. I also haven't found any documentation stating that a compute shader can't write to an FBO.
I guess my question is: why can't a texture attached to an FBO be accessed in a compute shader using `gimage2D`? Is there documentation explaining this?
Upvotes: 0
Views: 1190
Reputation: 43319
> "I guess my question is: why can't a texture attached to an FBO be accessed in a compute shader using `gimage2D`?"
You don't use `gimage2D`. If you see a type prefixed with `g` in GLSL documentation, it is a generic type (e.g. `gvec<N>`, `gsampler...`, etc.). It means that the function has overloads for every kind of `vec<N>` or `sampler...`. In this case, `gimage2D` is the short way of saying "this function accepts `image2D`, `iimage2D` or `uimage2D`".
There is no actual `gimage2D` type; the `g` prefix was invented solely for the purpose of keeping GLSL documentation short and readable ;)
I think you already know this, because the only actual code listed in the question uses `image2D`, but the way things were written I was not sure.
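For reference, this is a minimal sketch of what an actual (non-generic) image declaration looks like in a compute shader. The binding point, format and local size here are assumptions for illustration; the format qualifier and binding must match the corresponding `glBindImageTexture (...)` call on the application side:

```glsl
#version 430

layout (local_size_x = 16, local_size_y = 16) in;

// A concrete image type -- image2D, not gimage2D. The format
// qualifier (rgba32f) must match how the data is interpreted,
// and `binding = 0` must match the image unit passed to
// glBindImageTexture (...).
layout (rgba32f, binding = 0) readonly uniform image2D srcImage;

void main (void)
{
    ivec2 coord = ivec2 (gl_GlobalInvocationID.xy);
    vec4  texel = imageLoad (srcImage, coord);
    // ... do something with texel ...
}
```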
Have a look at `glMemoryBarrier (...)`; pay special attention to `GL_FRAMEBUFFER_BARRIER_BIT`.
Compute shaders are scheduled separately from the stages of the render pipeline; they have their own single-stage pipeline. This means that if you draw something into an FBO attachment, your compute shader may run before you even start drawing, or the compute shader may use an (invalid) cached view of the data because a change made in the render pipeline was not visible to the compute pipeline. Memory barriers help synchronize the render pipeline and compute pipeline for resources that are shared between both.
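A sketch of what that looks like at the command level, following the barrier bit suggested above (this is an untested fragment; `fbo`, `color_tex`, `num_verts`, `width` and `height` are assumed to be set up elsewhere):

```c
/* Render pipeline writes to the FBO's color attachment... */
glBindFramebuffer (GL_FRAMEBUFFER, fbo);
glDrawArrays      (GL_TRIANGLES, 0, num_verts);

/* Order the framebuffer writes above relative to the image
   load/store done by the compute shader dispatched below. */
glMemoryBarrier   (GL_FRAMEBUFFER_BARRIER_BIT);

/* Bind the attachment as an image and run the compute shader
   (16x16 local size assumed, matching the shader). */
glBindImageTexture (0, color_tex, 0, GL_FALSE, 0,
                    GL_READ_ONLY, GL_RGBA32F);
glDispatchCompute  (width / 16, height / 16, 1);
```

Without the `glMemoryBarrier (...)` call in the middle, nothing stops the implementation from scheduling the dispatch against a stale view of the attachment.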
The render pipeline has a lot of implicit synchronization and multi-stage data flow that gives a pretty straightforward sequential ordering for shaders (e.g. `glDraw*` initiates vertex->geometry->fragment), but the compute pipeline does away with virtually all of this in favor of explicit synchronization. There are all sorts of hazards that you need to consider with compute shaders and image load/store that you do not with traditional vertex/geometry/tessellation/fragment.
In other words, while declaring something `coherent` in a compute shader together with an appropriate barrier at the shader level will take care of synchronization between compute shader invocations, since the compute pipeline is separate from the render pipeline it does nothing to synchronize image load/store between a compute shader and a fragment shader. For that, you need `glMemoryBarrier (...)` to synchronize access to the memory resource at the command level. `glDraw* (...)` (entry-point for the render pipeline) is a separate command from `glDispatch* (...)` (entry-point for the compute pipeline), and you need to ensure these separate commands are ordered properly for image load/store to exhibit consistent behavior.
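To make the two levels concrete, shader-level synchronization looks something like this sketch (local size and binding are illustrative assumptions):

```glsl
#version 430

layout (local_size_x = 16, local_size_y = 16) in;
layout (rgba32f, binding = 0) coherent uniform image2D img;

void main (void)
{
    ivec2 coord = ivec2 (gl_GlobalInvocationID.xy);
    imageStore (img, coord, vec4 (1.0));

    // Make this invocation's image write visible to the rest of
    // the work group before any invocation reads it back...
    memoryBarrierImage ();
    barrier ();

    vec4 neighbor = imageLoad (img, coord + ivec2 (1, 0));

    // ...but none of this orders image access against the render
    // pipeline. That still requires glMemoryBarrier (...) at the
    // command level, between the draw and the dispatch.
}
```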
Without a memory barrier, there is no guarantee about the order commands are executed in; only that they produce results consistent with the order you issued them. In the render pipeline, which has strictly defined input/output for each shader stage, GL implementations can intelligently re-order commands while maintaining this property with relative ease. With compute shaders as well as image load/store in general, where the I/O is completely determined by run-time flow it is impossible without some help (memory barriers).
TL;DR: The reason it works if you use a sampler and not image load/store comes down to coherency guarantees (or the lack thereof). Image load/store simply does not guarantee that reads from an image are coherent (strictly ordered) with respect to anything that writes to an image, and instead requires you to explicitly synchronize access to the image. This is actually beneficial, as it allows you to simultaneously read/write the same image without leading to undefined behavior, but it requires some extra effort on your part to make it work.
Upvotes: 3