OpenGL ES2: Using FrameBuffer objects to render many shapes more quickly

Question

I've got a program that draws upwards of 90 textured 2D shapes, which the user can pick up and drag by touching the screen. There is noticeable choppiness, and DDMS tells me that the method taking the most CPU time (~85%) is draw(). Since only one shape is actually moving and the other 89 are not, would it be possible, and faster, to render the 89 static shapes to a texture using a framebuffer object and then draw that texture on a quad that fills the whole screen? If not, are there any other ways of speeding things up?

private void draw() {
    // Pass in the position information
    mCubePositions.position(0);
    GLES20.glVertexAttribPointer(mPositionHandle, mPositionDataSize, GLES20.GL_FLOAT, false, 0, mCubePositions);
    GLES20.glEnableVertexAttribArray(mPositionHandle);

    // Pass in the color information
    mCubeColors.position(0);
    GLES20.glVertexAttribPointer(mColorHandle, mColorDataSize, GLES20.GL_FLOAT, false, 0, mCubeColors);
    GLES20.glEnableVertexAttribArray(mColorHandle);

    // Pass in the texture coordinate information
    mCubeTextureCoordinates.position(0);
    GLES20.glVertexAttribPointer(mTextureCoordinateHandle, mTextureCoordinateDataSize, GLES20.GL_FLOAT, false, 0, mCubeTextureCoordinates);
    GLES20.glEnableVertexAttribArray(mTextureCoordinateHandle);

    // This multiplies the view matrix by the model matrix, and stores the
    // result in the MVP matrix
    // (which currently contains model * view).
    Matrix.multiplyMM(mMVPMatrix, 0, mViewMatrix, 0, mModelMatrix, 0);

    // Pass in the modelview matrix.
    GLES20.glUniformMatrix4fv(mMVMatrixHandle, 1, false, mMVPMatrix, 0);

    // This multiplies the modelview matrix by the projection matrix, and
    // stores the result in the MVP matrix
    // (which now contains model * view * projection).
    Matrix.multiplyMM(mMVPMatrix, 0, mProjectionMatrix, 0, mMVPMatrix, 0);

    // Pass in the combined matrix.
    GLES20.glUniformMatrix4fv(mMVPMatrixHandle, 1, false, mMVPMatrix, 0);

    // Draw the cube.
    GLES20.glDrawArrays(GLES20.GL_TRIANGLES, 0, 6);
}

Thanks in advance.

sleep · Accepted Answer

I got confused by the question referring to "cubes" when it meant quads, so this answer deals with the 3d case, which is probably more instructive anyway.

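Combine the view and projection matrices into a single ViewProj matrix on the CPU, once per frame rather than once per shape. Then in the vertex shader you compute ViewProj * Model * VertexPos (GLSL treats vertices as column vectors, so the matrix applied first sits closest to the position).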
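A minimal sketch of precomputing the combined matrix, assuming column-major float[16] matrices as OpenGL expects. On Android, android.opengl.Matrix.multiplyMM performs the same multiplication; the class ViewProjSketch and its helpers are hypothetical stand-ins so the snippet can run off-device, and the "projection" in main is only a placeholder scale matrix.

```java
import java.util.Arrays;

final class ViewProjSketch {
    /** Returns lhs * rhs for column-major 4x4 matrices stored as float[16]. */
    static float[] multiply(float[] lhs, float[] rhs) {
        float[] out = new float[16];
        for (int col = 0; col < 4; col++) {
            for (int row = 0; row < 4; row++) {
                float sum = 0f;
                for (int k = 0; k < 4; k++) {
                    // Element (row, k) of lhs times element (k, col) of rhs.
                    sum += lhs[k * 4 + row] * rhs[col * 4 + k];
                }
                out[col * 4 + row] = sum;
            }
        }
        return out;
    }

    static float[] identity() {
        float[] m = new float[16];
        m[0] = m[5] = m[10] = m[15] = 1f;
        return m;
    }

    static float[] translation(float x, float y, float z) {
        float[] m = identity();
        m[12] = x; m[13] = y; m[14] = z;
        return m;
    }

    static float[] scale(float x, float y, float z) {
        float[] m = identity();
        m[0] = x; m[5] = y; m[10] = z;
        return m;
    }

    public static void main(String[] args) {
        float[] view = translation(0f, 0f, -5f);
        float[] projection = scale(0.5f, 0.5f, 1f); // placeholder, not a real perspective matrix
        // Computed once per frame, then uploaded as a single uniform.
        float[] viewProj = multiply(projection, view);
        System.out.println(Arrays.toString(viewProj));
    }
}
```

With this in place, draw() no longer needs two matrix multiplies and two matrix uploads per shape; only the model matrix changes from shape to shape.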

Also you really need to batch. You should have a single big array with all your cubes in it, and another array with the transforms for each cube. Then you do a single draw call for all cubes. Consider converting it to use a Vertex Buffer Object. Draw calls are CPU intensive because they invoke a whole bunch of logic and memory copying etc. in the API behind the scenes. Game engines go to great lengths to minimise them.
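As a rough illustration of the "single big array" idea, here is a sketch that packs every shape into one direct FloatBuffer. It uses the question's quads (6 vertices each) rather than cubes, and the layout (x, y, z, u, v, then a per-vertex shape index stored as a float) is an assumption made for this example, not something the question's code already does. QuadBatchSketch and buildBatch are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

final class QuadBatchSketch {
    static final int FLOATS_PER_VERTEX = 6; // x, y, z, u, v, shapeIndex
    static final int VERTICES_PER_QUAD = 6; // two triangles

    // Unit quad centred on the origin: x, y, z, u, v for each of the 6 vertices.
    private static final float[] QUAD = {
        -0.5f, -0.5f, 0f, 0f, 1f,
         0.5f, -0.5f, 0f, 1f, 1f,
         0.5f,  0.5f, 0f, 1f, 0f,
        -0.5f, -0.5f, 0f, 0f, 1f,
         0.5f,  0.5f, 0f, 1f, 0f,
        -0.5f,  0.5f, 0f, 0f, 0f,
    };

    /** Builds one interleaved buffer holding quadCount copies of the unit quad. */
    static FloatBuffer buildBatch(int quadCount) {
        FloatBuffer buf = ByteBuffer
                .allocateDirect(quadCount * VERTICES_PER_QUAD * FLOATS_PER_VERTEX * 4)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        for (int quad = 0; quad < quadCount; quad++) {
            for (int v = 0; v < VERTICES_PER_QUAD; v++) {
                buf.put(QUAD, v * 5, 5); // position and UV from the template
                buf.put((float) quad);   // which transform this vertex should use
            }
        }
        buf.position(0);
        return buf;
    }

    public static void main(String[] args) {
        FloatBuffer batch = buildBatch(90);
        // On device this buffer would be uploaded once (ideally into a VBO) and
        // drawn with a single glDrawArrays(GL_TRIANGLES, 0, 6 * 90) call.
        System.out.println("floats in batch: " + batch.capacity());
    }
}
```

The buffer is built once at load time; since the geometry never changes, nothing in it needs to be rewritten when a shape is dragged.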

How to make one draw call draw many things

Put all the different textures into a single texture (an "atlas"), and compensate by adjusting the UVs of each cube to look up the appropriate portion of the texture. Put all your model matrices into a contiguous array, and index into this array in your vertex shader e.g.

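attribute vec3  a_position;
attribute vec2  a_texCoord;
attribute float a_modelIndex;   // GLSL ES 1.00 has no integer attributes, so indices are passed as floats.
attribute float a_UVIndex;

uniform   mat4   u_viewProj;
uniform   mat4   u_model[90];      // 90 mat4s = 360 uniform vectors; check GL_MAX_VERTEX_UNIFORM_VECTORS on your device.
uniform   vec2   u_UVOffset[16];   // Support 16 different textures in our atlas.

varying   vec2   v_texCoord;
...

void main()
{
    gl_Position = u_viewProj * u_model[int(a_modelIndex)] * vec4(a_position, 1.0);
    v_texCoord  = a_texCoord + u_UVOffset[int(a_UVIndex)];
    ...
}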
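The offsets fed to u_UVOffset could be computed along these lines. This is only a sketch: it assumes the atlas is a square grid of equally sized tiles (4 x 4 for the 16 textures above), which the answer does not specify, and AtlasOffsetsSketch and its methods are hypothetical names. Because the shader adds the offset to a_texCoord, the per-vertex UVs must already be scaled down to the size of one tile.

```java
final class AtlasOffsetsSketch {
    /** Returns (u, v) offsets for every tile, flattened as u0, v0, u1, v1, ... */
    static float[] tileOffsets(int tilesPerSide) {
        float tileSize = 1f / tilesPerSide;
        int tileCount = tilesPerSide * tilesPerSide;
        float[] offsets = new float[tileCount * 2];
        for (int tile = 0; tile < tileCount; tile++) {
            offsets[tile * 2]     = (tile % tilesPerSide) * tileSize; // column -> u
            offsets[tile * 2 + 1] = (tile / tilesPerSide) * tileSize; // row    -> v
        }
        return offsets;
    }

    /** Corner UVs of a quad, shrunk to cover exactly one tile. */
    static float[] tileLocalUVs(int tilesPerSide) {
        float s = 1f / tilesPerSide;
        return new float[] { 0f, 0f,  s, 0f,  s, s,  0f, s };
    }

    public static void main(String[] args) {
        float[] offsets = tileOffsets(4);
        // On device: GLES20.glUniform2fv(uvOffsetHandle, 16, offsets, 0);
        // where uvOffsetHandle is the (hypothetical) location of u_UVOffset.
        System.out.println("tile 5 offset: " + offsets[10] + ", " + offsets[11]);
    }
}
```

Changing which texture a shape shows then means changing its index attribute, not rebinding a texture between draw calls.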

You can pack all your vertex data into one big array, so you end up doing GLES20.glDrawArrays(GLES20.GL_TRIANGLES, 0, 6 * 90); for the quads in the question (36 * 90 for cubes).

But even better, since you are just drawing cubes all the time, you can re-use the same "cube template" for every cube and let the model matrices take care of the rest (scale, rotation, translation). To do this, use glDrawElements instead of glDrawArrays and, assuming tri lists for simplicity, specify the 36 indices that make up one cube, then repeat that pattern 90 times to make your index array. The template vertices should be a unit cube, centered on (0, 0, 0), which the model matrix in the vertex shader turns into each visible "cube instance".

One caveat: core ES 2.0 has no instancing, and an index selects every attribute of a vertex at once. Each cube therefore still needs its own copy of the template vertices carrying its own a_modelIndex, and the repeated indices must be offset by that cube's first vertex. The copies are identical apart from the index and never change, so the only things you need to update each frame are the model matrices and the texture UVs.
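A sketch of building that index array, assuming each cube owns its own copy of the template vertices (so a per-vertex model index can differ between cubes) and that each face uses 4 vertices so it can carry its own UVs. Both choices are assumptions for this example; CubeIndexSketch and buildIndices are hypothetical names.

```java
final class CubeIndexSketch {
    static final int FACES = 6;
    static final int VERTICES_PER_CUBE = FACES * 4; // 4 corners per face
    static final int INDICES_PER_CUBE = FACES * 6;  // 2 triangles per face = 36

    /** Repeats the 36-index cube pattern once per cube, offset to that cube's vertices. */
    static short[] buildIndices(int cubeCount) {
        if (cubeCount * VERTICES_PER_CUBE > 65536) {
            throw new IllegalArgumentException("too many vertices for GL_UNSIGNED_SHORT indices");
        }
        short[] indices = new short[cubeCount * INDICES_PER_CUBE];
        int out = 0;
        for (int cube = 0; cube < cubeCount; cube++) {
            int cubeBase = cube * VERTICES_PER_CUBE;
            for (int face = 0; face < FACES; face++) {
                int base = cubeBase + face * 4;
                // Two triangles per face: (0, 1, 2) and (0, 2, 3).
                indices[out++] = (short) base;
                indices[out++] = (short) (base + 1);
                indices[out++] = (short) (base + 2);
                indices[out++] = (short) base;
                indices[out++] = (short) (base + 2);
                indices[out++] = (short) (base + 3);
            }
        }
        return indices;
    }

    public static void main(String[] args) {
        short[] indices = buildIndices(90);
        // On device the array would go into a ShortBuffer and be drawn with one
        // glDrawElements(GL_TRIANGLES, indices.length, GL_UNSIGNED_SHORT, ...) call.
        System.out.println("index count: " + indices.length);
    }
}
```

Like the vertex data, the index array is static, so it only has to be built and uploaded once.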

glVertexAttribPointer() allows you to spew pretty much anything you like into your vertex shader, and it may be more efficient to have the model matrices as attributes rather than uniforms with some creative use of glVertexAttribPointer.

Mobile devices tend to be quite sensitive to being pixel bound (fill-rate limited). If your cubes are quite large on the screen, you might be getting a lot of overdraw. The high CPU % (it is just a percentage after all) could be a red herring, and you may be pixel bound on the GPU. A simple test for this is to make all your cubes very small and see if the framerate improves.
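One way to run that test without touching the vertex data is to shrink each model matrix just before it is uploaded. The sketch below assumes column-major float[16] model matrices; scaling the first three columns is what android.opengl.Matrix.scaleM does for a uniform scale. ShrinkTestSketch and shrinkInPlace are hypothetical names.

```java
final class ShrinkTestSketch {
    /**
     * Uniformly scales the shape in its local space. The translation column
     * (elements 12..15) is left alone, so the shape stays where it was.
     */
    static void shrinkInPlace(float[] modelMatrix, float factor) {
        for (int i = 0; i < 12; i++) {
            modelMatrix[i] *= factor;
        }
    }

    public static void main(String[] args) {
        float[] model = {
            1f, 0f, 0f, 0f,
            0f, 1f, 0f, 0f,
            0f, 0f, 1f, 0f,
            3f, 4f, 5f, 1f, // translation
        };
        shrinkInPlace(model, 0.1f); // temporary, for the fill-rate experiment only
        System.out.println(java.util.Arrays.toString(model));
    }
}
```

If the frame rate jumps with the shapes shrunk, the same number of draw calls is covering far fewer pixels, which points at fill rate rather than CPU overhead.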

For reference, the S5570 has an Adreno 200 GPU.
