Reputation: 13
I'm trying to render 3 quads (1 background, 2 sprites) using OpenGL. I have the following code:
void GLRenderer::onDrawObjects(long p_dt)
{
    float _log_last_time = ELAPSED_MS;
    fprintf(stdout, "-- starting draw: %.4f, dTime: %.4f\n", ELAPSED_MS, ELAPSED_MS - _log_last_time);

    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    for(std::vector<queuedRenderable*>::iterator itr = m_displayList.begin();
        itr != m_displayList.end();
        itr++)
    {
        Sprite* spriteToRender = (*itr)->m_sprite;

        float _log_last_time1 = ELAPSED_MS;
        fprintf(stdout, "---- profiling object: %.4f, dTime: %.4f\n", ELAPSED_MS, ELAPSED_MS - _log_last_time1);

        glMatrixMode( GL_MODELVIEW );
        glLoadIdentity();
        // glTranslate, glRotate, glScale goes here
        fprintf(stdout, "---- transform done: %.4f, dTime: %.4f\n", ELAPSED_MS, ELAPSED_MS - _log_last_time1);
        _log_last_time1 = ELAPSED_MS;

        glEnable(GL_TEXTURE_2D);
        glBindTexture(GL_TEXTURE_2D, (*itr)->m_material->m_glId);
        fprintf(stdout, "---- set material done: %.4f, dTime: %.4f\n", ELAPSED_MS, ELAPSED_MS - _log_last_time1);
        _log_last_time1 = ELAPSED_MS;

        unsigned listIds[] = {
            (*itr)->m_mesh->m_glId
        };
        glCallLists(1, GL_UNSIGNED_INT, listIds);
        fprintf(stdout, "---- draw mesh done: %.4f, dTime: %.4f\n", ELAPSED_MS, ELAPSED_MS - _log_last_time1);
        _log_last_time1 = ELAPSED_MS;
    }

    fprintf(stdout, "-- flushing to card: %.4f, dTime: %.4f\n", ELAPSED_MS, ELAPSED_MS - _log_last_time);
    _log_last_time = ELAPSED_MS;

    glFlush();
    glFinish();
    fprintf(stdout, "-- flush done: %.4f, dTime: %.4f\n", ELAPSED_MS, ELAPSED_MS - _log_last_time);
    _log_last_time = ELAPSED_MS;
}
And I'm getting the following results in console:
-- starting draw: 24000.0000, dTime: 0.0000
---- profiling object: 24014.0000, dTime: 0.0000
---- transform done: 24033.0000, dTime: 19.0000
---- set material done: 24046.0000, dTime: 0.0000
---- draw mesh done: 24066.0000, dTime: 1.0000
---- profiling object: 24084.0000, dTime: 0.0000
---- transform done: 24102.0000, dTime: 18.0000
---- set material done: 24120.0000, dTime: 0.0000
---- draw mesh done: 24305.0000, dTime: 164.0000
---- profiling object: 24319.0000, dTime: 0.0000
---- transform done: 24338.0000, dTime: 19.0000
---- set material done: 24356.0000, dTime: 0.0000
---- draw mesh done: 24375.0000, dTime: 2.0000
-- flushing to card: 24389.0000, dTime: 389.0000
-- flush done: 24424.0000, dTime: 18.0000
As you can see, drawing the 2nd quad takes very long (164 ms). I have no idea why, since they are all quads. I've also checked the texture (in case that is relevant), but it is a lot smaller than the background's (1st quad). If anything, I would expect the first quad to take longer to draw, not the second.
I'm fairly new to OpenGL (I have done a lot of OpenGL in the past, but just beginner stuff), so I have some idea of what I'm doing. I'm just not sure how I can optimize this further to get rid of that 164 ms.
How can I remove or minimize that 164 ms delay?
I should note that I am running this on an Intel Atom machine with an Intel GMA 450.
Upvotes: 1
Views: 462
Reputation: 473437
Let's assume that this mysterious ELAPSED_MS macro were a cycle-accurate CPU counter. I rather doubt that it is, since it seems to jump in large millisecond increments, but let's pretend that this is the case.
Your methodology for profiling OpenGL is completely wrong. You are not measuring the time it takes to render a quad. You are measuring the time it takes for glFinish to complete. There's no guarantee that glFinish will return the moment the quad finishes rendering. Indeed, it could be waiting quite a while, possibly for a thread task switch or something. Who knows.
The point is that your methods for profiling are not reliable. Please stop using them. The only accurate ways of getting profiling data for actual rendering operations in OpenGL are either to use timer queries (which are specifically designed to measure how long operations take to complete on the GPU) or to use proprietary extensions; a minimal sketch of the former follows.
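As a rough illustration only (not your code), here is a minimal sketch of a GPU timer query. It assumes the driver exposes ARB_timer_query / OpenGL 3.3 and that GLEW (or an equivalent loader) provides the entry points; older integrated parts such as the GMA chip you mention may not support this at all, and the function name profileDraw is made up for the example:

    // Sketch: measure GPU time for a block of draw calls with GL_TIME_ELAPSED.
    // Assumes a GL 3.3 / ARB_timer_query capable context and a loader like GLEW.
    #include <GL/glew.h>
    #include <cstdio>

    static GLuint queryId = 0;   // illustrative name

    void profileDraw()
    {
        if (queryId == 0)
            glGenQueries(1, &queryId);

        glBeginQuery(GL_TIME_ELAPSED, queryId);
        // ... issue the draw calls you want to measure here ...
        glEndQuery(GL_TIME_ELAPSED);

        // Spin until the result is available (simple but stalls the CPU;
        // in real code you would poll on a later frame instead).
        GLint available = 0;
        while (!available)
            glGetQueryObjectiv(queryId, GL_QUERY_RESULT_AVAILABLE, &available);

        GLuint64 elapsedNs = 0;
        glGetQueryObjectui64v(queryId, GL_QUERY_RESULT, &elapsedNs);
        fprintf(stdout, "GPU time: %.3f ms\n", elapsedNs / 1.0e6);
    }

The result is the time the GPU actually spent on those commands, in nanoseconds, independent of when glFinish happens to return.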
Micro-profiling (like profiling the time it takes to draw a quad) is very difficult to do right. Quad drawing in particular is generally worthless to bother profiling.
ELAPSED_MS is ((((float)clock()) / CLOCKS_PER_SEC) * 1000.0f)
That doesn't help either. Even if your profiling methods were good, clock is not a reliable way to do any form of timing. You have to use either platform-specific calls or a platform-neutral library like C++11's <chrono> (or the Boost equivalent).
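As a sketch, assuming C++11 is available, a CPU-side interval could be measured with std::chrono::steady_clock instead of clock; the variable names below are illustrative:

    // Sketch: CPU-side timing with std::chrono (C++11).
    #include <chrono>
    #include <cstdio>

    int main()
    {
        using clock_type = std::chrono::steady_clock;   // monotonic clock

        clock_type::time_point start = clock_type::now();
        // ... CPU-side work to be timed goes here ...
        double elapsedMs =
            std::chrono::duration<double, std::milli>(clock_type::now() - start).count();

        fprintf(stdout, "elapsed: %.4f ms\n", elapsedMs);
        return 0;
    }

Note that this still only measures CPU-side time (how long it took to submit the calls), not how long the GPU actually spent rendering; for that you need the timer queries described above.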
Upvotes: 5