Reputation: 145
I have a buffer that I map and fill with vertex attributes to send to the GPU. Here is the basic functionality of the code:
// Map the vertex buffer for writing only.
glBindBuffer(GL_ARRAY_BUFFER, _bufferID);
_buffer = (VertexData*)glMapBuffer(GL_ARRAY_BUFFER, GL_WRITE_ONLY);
for(Renderable* renderable : renderables){
    const glm::vec3& size = renderable->getSize();
    const glm::vec3& position = renderable->getPosition();
    const glm::vec4& color = renderable->getColor();
    const glm::mat4& modelMatrix = renderable->getModelMatrix();
    glm::vec3 vertexNormal = glm::vec3(0, 1, 0);
    // Write one vertex: the position is transformed on the CPU by the model matrix.
    _buffer->position = glm::vec3(modelMatrix * glm::vec4(position.x, position.y, position.z, 1));
    _buffer->color = color;
    _buffer->texCoords = glm::vec2(0, 0);
    _buffer->normal = vertexNormal;
    // Advance to the next vertex slot in the mapped memory.
    _buffer++;
}
I then draw all renderables in one draw call. I am curious as to why touching the _buffer variable at all causes a massive slowdown in the program. For example, if I call std::cout << _buffer->position.x; every frame, my FPS drops to about a quarter of what it usually is.
What I want to know is why it does this. The reason I ask is that I want to be able to translate objects in the batch when they are moved. Essentially, I want the buffer to stay in the same place, unchanged, while still being able to modify it without a large performance cost. I assume this isn't possible, but I would like to know why. Here is an example of what I would want to do if this didn't cause massive issues:
if(renderables.at(index)->hasChangedPosition()){
    // Assumes _buffer points at the first vertex of the mapped buffer.
    _buffer[index].position = renderables.at(index)->getPosition();
}
I am aware that I can send the transforms through a shader uniform, but you can't do that for batched objects drawn in one draw call.
Upvotes: 2
Views: 532
Reputation: 52084
why touching the _buffer variable at all causes massive slow down in the program
...well, you did request a GL_WRITE_ONLY buffer; it's entirely possible that the GL driver set up the memory pages backing the pointer returned by glMapBuffer() with a custom fault handler that actually goes out to the GPU to fetch the requested bytes, which can be...not fast.
Whereas if you only write to the provided addresses, the driver/OS doesn't have to do anything until the glUnmapBuffer() call, at which point it can set up a nice, fast DMA transfer to blast the new buffer contents out to GPU memory in one go.
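One way to act on this, sketched here under stated assumptions rather than as a definitive implementation, is to keep a CPU-side copy of the vertices, do all reads and per-object edits there, and upload only the range that changed. Everything in the sketch is hypothetical: ShadowBuffer, its Upload hook, and the simplified VertexData (plain floats standing in for the glm types in the question). The hook exists only so the sketch compiles and runs without a GL context.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for the question's VertexData (plain floats, no glm).
struct VertexData {
    float position[3];
    float color[4];
    float texCoords[2];
    float normal[3];
};

// CPU-side copy of the vertex buffer. All reads and edits go through this
// copy, never through the pointer returned by glMapBuffer().
class ShadowBuffer {
public:
    // Hypothetical upload hook: (byteOffset, byteSize, pointerToData).
    using Upload = std::function<void(std::size_t, std::size_t, const void*)>;

    explicit ShadowBuffer(std::size_t count) : _vertices(count) {}

    // Reading is cheap here: this is ordinary system memory.
    const VertexData& read(std::size_t index) const {
        assert(index < _vertices.size());
        return _vertices[index];
    }

    // Edit one vertex position and widen the dirty range to include it.
    void setPosition(std::size_t index, float x, float y, float z) {
        assert(index < _vertices.size());
        _vertices[index].position[0] = x;
        _vertices[index].position[1] = y;
        _vertices[index].position[2] = z;
        if (!_dirty) {
            _first = _last = index;
            _dirty = true;
        } else {
            if (index < _first) _first = index;
            if (index > _last) _last = index;
        }
    }

    // Send the dirty range (if any) through the hook; returns the bytes sent.
    std::size_t flush(const Upload& upload) {
        if (!_dirty) return 0;
        const std::size_t offset = _first * sizeof(VertexData);
        const std::size_t size = (_last - _first + 1) * sizeof(VertexData);
        upload(offset, size, &_vertices[_first]);
        _dirty = false;
        return size;
    }

private:
    std::vector<VertexData> _vertices;
    std::size_t _first = 0;
    std::size_t _last = 0;
    bool _dirty = false;
};
```

In a real renderer, with the buffer bound to GL_ARRAY_BUFFER, the hook passed to flush() would typically forward its three arguments to glBufferSubData(GL_ARRAY_BUFFER, offset, size, data). The sketch uploads one contiguous range covering every edited vertex, so two edits far apart also resend the untouched vertices between them; whether that beats remapping the whole buffer depends on the driver and the workload.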
Upvotes: 4