Reputation: 51
I have an SSBO named sparseMatrix and the following order of operations:
void callerFunc()
{
func1();
func2();
}
/* Clear buffer data store and fill with compute shader */
void func1()
{
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, sparseMatrix);
GLfloat floatZero = 0.0f;
glClearBufferSubData(GL_SHADER_STORAGE_BUFFER, GL_R32F, 0 /* offset; missing from the code as first posted */, sizeof(GLfloat)*size, GL_RED, GL_FLOAT, &floatZero);
/* use shader program, bind uniforms */
glDispatchCompute(numWorkGroups,1,1); // fills buffer by adding a few numbers
}
/* Download data store contents and print */
void func2()
{
glBindBuffer(GL_SHADER_STORAGE_BUFFER, sparseMatrix);
GLfloat* temp = new GLfloat[size];
glGetBufferSubData(GL_SHADER_STORAGE_BUFFER, 0, sizeof(GLfloat)*size, temp);
/* print values to console */
delete[] temp; // release the temporary readback array
}
There are no calls in between func1() and func2().
The values that are printed to a console are garbage (every float is -107374176.000000). I tested this on two machines, one with a GeForce GTX 570 and one with a GeForce GT 750M, with the exact same result, including the alterations below. Driver version is 335.23.
I tried making all of the following alterations to the code (every alteration separately):
- If I move the code of func2() to the end of func1(), the values turn out fine.
- If I move the code of func2() directly into callerFunc(), the values turn out fine.
- If I add a glGetBufferSubData() call on the SSBO at the end of func1(), the values queried in func2() turn out fine.
- If I put a glFinish() after the glClearBuffer call or at the end of func1(), the values in func2() are correct. If I place the glFinish() at the beginning of func2() though, it doesn't change anything.
- Putting a glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT) anywhere doesn't help either.
Does anyone have an explanation for this peculiar behavior?
EDIT: I replaced the calls to glClearBufferSubData(...) with a compute shader that fills up the data store with a constant value, and now the behavior is as expected. But I still don't know what caused the problem.
EDIT 2: Thank you for the answer, but actually I used it correctly. When I posted the code here I forgot to put in the offset parameter, sorry about that :( I encountered the problem again during another long list of consecutive compute dispatches. I tried many things and in the end it helped to put a GL_TEXTURE_FETCH_BARRIER_BIT memory barrier instead of the GL_SHADER_STORAGE_BARRIER_BIT barrier, although the compute shaders work purely on SSBOs. I have no idea why.
Upvotes: 2
Views: 542
Reputation: 8317
The error is in how you are using the glClearBufferSubData function.
Just look at the specification: the parameters are, in order, target, internalformat, offset, size, format, type, and data.
You are basically providing no offset. I am surprised you didn't get a compiler error, since you were missing a parameter.
Code sample:
GLfloat zeroFloat = 0.0f;
glClearBufferSubData(GL_SHADER_STORAGE_BUFFER, //target
GL_R32F, //internal format
0, //you were missing this: offset
sizeof(GLfloat)*size, //size
GL_RED, //format
GL_FLOAT, //type
&zeroFloat); //data
Edited answer to reflect request in comments:
A GPU may use caches at different levels of the pipeline, so changes to a buffer object are not immediately visible to other stages of the pipeline. A memory barrier forces coherency for the specified targets, so that every write operation issued before the barrier is visible after the barrier. A write operation issued after the barrier is not covered by it, and that is where you get into trouble. The required order is:
write
Memory Barrier
read
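The write, barrier, read ordering can be sketched for the case in the question, where a compute shader writes the SSBO and the host then reads it back. This is only a sketch under assumptions, not a confirmed fix for the behavior reported above: the helper name `clearFillAndRead`, its parameters, and the use of GLEW as the function loader are hypothetical, and the code assumes a current OpenGL 4.3+ context and an already linked compute program. One detail worth noting is that the bits passed to glMemoryBarrier describe how the data will be accessed after the barrier, so for a readback through glGetBufferSubData the matching bit is GL_BUFFER_UPDATE_BARRIER_BIT, while GL_SHADER_STORAGE_BARRIER_BIT covers later shader accesses to storage blocks.

```cpp
#include <GL/glew.h>   // assumption: GLEW as loader; any loader exposing GL 4.3 works
#include <cstddef>
#include <vector>

// Hypothetical helper. Assumes a current OpenGL 4.3+ context, a linked compute
// program, and an SSBO whose data store holds at least `count` floats.
std::vector<GLfloat> clearFillAndRead(GLuint program, GLuint ssbo,
                                      std::size_t count, GLuint numWorkGroups)
{
    const GLsizeiptr bytes = static_cast<GLsizeiptr>(count * sizeof(GLfloat));

    // Binds the buffer to indexed binding point 1 and to the generic
    // GL_SHADER_STORAGE_BUFFER target used by the calls below.
    glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, ssbo);

    // Clear the data store to 0.0f; note the explicit offset of 0.
    const GLfloat zero = 0.0f;
    glClearBufferSubData(GL_SHADER_STORAGE_BUFFER, GL_R32F, 0, bytes,
                         GL_RED, GL_FLOAT, &zero);

    // Write: the compute shader fills the buffer.
    glUseProgram(program);
    glDispatchCompute(numWorkGroups, 1, 1);

    // Barrier: make the shader writes visible to buffer reads issued through
    // the API (glGetBufferSubData, glMapBuffer, ...) after this point.
    glMemoryBarrier(GL_BUFFER_UPDATE_BARRIER_BIT);

    // Read: download the contents into host memory.
    std::vector<GLfloat> result(count);
    glGetBufferSubData(GL_SHADER_STORAGE_BUFFER, 0, bytes, result.data());
    return result;
}
```

Returning a std::vector instead of a raw `new GLfloat[size]` array is a design choice of this sketch so that the readback storage is released automatically.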
You mentioned a long computation, so (assuming there is no driver bug) it is possible that the current content of the SSBO depends on a texture. In that case a memory barrier on the texture makes sure the content of the SSBO reflects the texture data. The memory barrier on the SSBO then appears unnecessary, because the SSBO already holds the correct data by the time it is accessed. In theory you need both the texture and the SSBO bits set in that case: a texture barrier before updating the SSBO, and an SSBO barrier before using it.
If you can reproduce the issue with a small code sample, then it could be a driver bug. Remember to test the code on multiple machines with different hardware, because you could still get unexpected results due to missing memory barriers.
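As a side note, the garbage value from the question can be decoded to see what was actually read back. The small helper below (the name `floatBits` is made up for this illustration) returns the raw bit pattern of a 32-bit float. For -107374176.0f the pattern is 0xCCCCCCCC, so every byte is 0xCC. A repeated byte like that looks more like a fill pattern than a computed result, which would be consistent with reading memory that neither the clear nor the shader had written yet. What produced the pattern is not established anywhere in this thread, so treat that reading as an assumption.

```cpp
#include <cstdint>
#include <cstring>

// Returns the raw IEEE 754 bit pattern of a 32-bit float.
std::uint32_t floatBits(float value)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch expects a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined way to reinterpret the bytes
    return bits;
}
```

Calling `floatBits(-107374176.0f)` yields `0xCCCCCCCC`, while `floatBits(0.0f)` yields `0`, which is what a successfully cleared buffer element would show.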
Upvotes: 2