Draz

Reputation: 782

OpenGL ES shader huge performance problems on older phones

I am facing performance issues on some phones with my light shader. The shader calculates line lights and point lights based on normal maps. It also indicates hits on targets (it's a 2D top-down shooter game).

I am developing on a Sony Xperia Z5 Compact, where everything works just fine. Then I tried it on my very old Samsung Galaxy S2, where it was really slow. I didn't worry about that because of the phone's age.

Next I tried it on a Galaxy S4 and a Galaxy S5, but it doesn't seem to run any faster there than on the Galaxy S2. I also had huge compilation times (around 90 seconds), which I managed to bring down to a few seconds by optimizing the code (I think it's still messy, though; I am not really into shaders).

I really don't know what the bottleneck is here or why the shader isn't running any faster on the S4 and S5.

This is my shader:

#ifdef GL_ES
#define LOWP lowp

precision highp float;

#else
    #define LOWP
#endif


#define MAX_NUM_POINTLIGHTS  16


uniform vec3 lpos[MAX_NUM_POINTLIGHTS]; // Light position
uniform vec3 foff[MAX_NUM_POINTLIGHTS]; // Light falloff
uniform vec4 acol[MAX_NUM_POINTLIGHTS]; // Ambient color
uniform vec4 lcol[MAX_NUM_POINTLIGHTS]; // Light color



//textures
uniform sampler2D u_texture1; // diffuse texture 
uniform sampler2D u_texture2; // normalmap


varying vec4 vColor;
varying vec2 vTexCoord;
varying float vFlags;


uniform vec2 Resolution;      //resolution of screen

const float WORLD_WIDTH = 1440.0;
const float WORLD_HEIGHT = 2560.0;

vec3 getPointLightColor(const vec4);
vec3 rotateVector(const vec3 vector, const float angle);
vec2 screenCoordToWorldCoord(const vec2 screencoord);
vec3 calculatePointLight(const vec4 DiffuseColor, vec3 LightPos, const vec3 Falloff, const vec4 LightColor, const vec4 AmbientColor);



const float stdratio = WORLD_HEIGHT / WORLD_WIDTH;
vec2 worldFragCoord;
const float worldRatio_W_DIV_H = WORLD_WIDTH / WORLD_HEIGHT;
const vec2 worldSize = vec2(WORLD_WIDTH, WORLD_HEIGHT);



// Light variables 
vec3 NormalMap;
vec2 worldFragCoordNormalized;
vec3 N;


void main() {

    worldFragCoord = screenCoordToWorldCoord(gl_FragCoord.xy);

    // Common light calculations
    NormalMap = texture2D(u_texture2, vTexCoord).rgb;
    worldFragCoordNormalized = worldFragCoord/vec2(1440.0, 2560.0);
    N = normalize(NormalMap * 2.0 - 1.0);


    vec4 DiffuseColor = texture2D(u_texture1, vTexCoord);
    vec2 fragcoord = gl_FragCoord.xy;

    vec3 pointcolor = getPointLightColor(DiffuseColor);



    vec4 finalColor;

    // green channel of vColor indicates hit

    if (vColor.g > 0.0 && vColor.a == 0.0) {
        vec4 fragcol = vec4(pointcolor, DiffuseColor.a);
        vec4 addColor;
        if (vColor.g > 0.67)
            addColor = vec4(1.0,1.0,1.0, DiffuseColor.a*vColor.g);
        else if (vColor.g > 0.52)
            addColor = vec4(1.0,0.0,0.0, DiffuseColor.a*vColor.g);
        else if (vColor.g > 0.37)
            addColor = vec4(0.0,0.0,1.0, DiffuseColor.a*vColor.g);
        else if (vColor.g > 0.22)
            addColor = vec4(1.0,1.0,0.0, DiffuseColor.a*vColor.g);
        else
            addColor = vec4(0.0,1.0,1.0, DiffuseColor.a*vColor.g);

        finalColor = addColor*addColor.a + fragcol*(1.0-addColor.a);
    }
    else
        finalColor = vec4(pointcolor, DiffuseColor.a);



    gl_FragColor = finalColor;

}



vec3 rotateVector(const vec3 vector, const float angle){

    float degree = radians(360.0*angle); // Angle is normalized to 0 - 1

    float cos_ = cos(degree);
    float sin_ = sin(degree);

    return vec3(vector.x*cos_ - vector.y*sin_, vector.x*sin_ + vector.y*cos_, vector.z);
}



vec3 calculatePointLight(const vec4 DiffuseColor, vec3 LightPos, const vec3 Falloff, const vec4 LightColor, const vec4 AmbientColor){


    if (LightPos.x == 0.0 && LightPos.y == 0.0)
        return vec3(0.0);



    LightPos.xy = LightPos.xy / worldSize;


    //The delta position of light
    vec3 LightDir = vec3(LightPos.xy - worldFragCoordNormalized, LightPos.z);

    //Correct for aspect ratio
    LightDir.x *= worldRatio_W_DIV_H;

    //Determine distance (used for attenuation) BEFORE we normalize our LightDir
    float D = length(LightDir);

    //normalize our vectors

    vec3 L = normalize(LightDir);
    vec3 NN = N;
    if (vColor.a == 0.0)
        NN = normalize(rotateVector(NN, vColor.r));

    //Pre-multiply light color with intensity
    //Then perform "NN dot L" to determine our diffuse term
    vec3 Diffuse = (LightColor.rgb * LightColor.a) * max(dot(NN, L), 0.0);

    //pre-multiply ambient color with intensity
    vec3 Ambient = AmbientColor.rgb * AmbientColor.a;

    //calculate attenuation
    float Attenuation = 1.0 / ( Falloff.x + (Falloff.y*D) + (Falloff.z*D*D) );

    //the calculation which brings it all together
    vec3 Intensity = Ambient + Diffuse * Attenuation;
    vec3 FinalColor = DiffuseColor.rgb * Intensity;


    return FinalColor;
}



vec3 getPointLightColor(const vec4 DiffuseColor){

    vec3 sum = vec3(0.0);

    for (int i = 0; i < MAX_NUM_POINTLIGHTS; i++)
    {
        sum += calculatePointLight(DiffuseColor, lpos[i], foff[i], lcol[i], acol[i]);
    }

    return sum;

}




vec2 screenCoordToWorldCoord(const vec2 screencoord){
    float ratio = Resolution.y / Resolution.x;

    vec2 resCoord;
    if (ratio == stdratio){
        // Ratio is standard
        resCoord = screencoord * (WORLD_HEIGHT / Resolution.y);
    } else if (ratio > stdratio) {
        // Screen gets extended vertically (black bars top/bottom)

        float screenheight = Resolution.x * stdratio;
        float bottom = (Resolution.y - screenheight) / 2.0;
        resCoord = vec2(screencoord.x, screencoord.y - bottom);
        resCoord *= (WORLD_WIDTH / Resolution.x);

    } else {
        // Screen gets extended horizontally (black bars left/right)

        float screenwidth = Resolution.y / stdratio;
        float left = (Resolution.x - screenwidth) / 2.0;
        resCoord = vec2(screencoord.x - left, screencoord.y);
        resCoord *= (WORLD_HEIGHT / Resolution.y);

    }

    return resCoord;
}

Upvotes: 0

Views: 597

Answers (1)

Columbo

Reputation: 6766

Here's what I'd do to tackle it:

  1. Remove screenCoordToWorldCoord. It's a simple transformation, you could do it with a matrix multiply or a couple of dot products, or better still, move the work to the vertex shader and pass the results in a varying rather than constructing from gl_FragCoord.

  2. Compile a different version of the shader for each light count, and unroll the for loop. You get to drop the if at the top of calculatePointLight too.

  3. Remove all your remaining if statements - some devices hate conditionals. Do the logic with maths instead, the step function helps.

  4. Is there a way to drop rotateVector? I can't quite figure out what it's doing, but it's expensive and feels like it's something that shouldn't be necessary in the fragment shader. At the very least, it needn't be in the inner loop because the result is the same regardless of the light. It might be better done with some sort of matrix multiply rather than using sin/cos.

  5. Use precision correctly. Some devices can do maths much faster at lowp/mediump than at highp. Rule of thumb - lowp for colours, mediump for normals, highp for positions.

  6. Do some light culling on the CPU. I guess not all the lights affect every pixel. If you can chop your scene up into tiles and only let the most important lights count then you can do a lot less work.

  7. LightPos.xy / worldSize looks like you could do it once on the CPU instead of once per pixel.
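
As a rough illustration of points 2, 3, 4, 5 and 7 combined, here is a sketch of how the fragment shader might be restructured. It is not a drop-in replacement and has not been profiled on any of the devices mentioned. The names NUM_LIGHTS, lposNorm and vWorldNorm are hypothetical: NUM_LIGHTS is assumed to be defined by the application, one compiled variant per light count; lposNorm is assumed to hold lpos with xy already divided by the world size on the CPU; vWorldNorm is assumed to be a varying carrying the normalized world position from the vertex shader. The EPS threshold also assumes the vertex color channels are 8-bit, so the smallest non-zero value is about 0.004.

```glsl
// Sketch only: NUM_LIGHTS, lposNorm and vWorldNorm are hypothetical names,
// not part of the shader in the question.
#ifdef GL_ES
    #ifdef GL_FRAGMENT_PRECISION_HIGH
        #define HIGHP highp
    #else
        #define HIGHP mediump
    #endif
    #define LOWP lowp
    precision mediump float;   // default: mediump instead of highp
#else
    #define HIGHP
    #define LOWP
#endif

#ifndef NUM_LIGHTS
#define NUM_LIGHTS 4           // assumed to be overridden per compiled variant
#endif

uniform HIGHP vec3 lposNorm[NUM_LIGHTS]; // light position, xy pre-divided by world size (CPU)
uniform vec3 foff[NUM_LIGHTS];           // light falloff
uniform vec4 acol[NUM_LIGHTS];           // ambient color
uniform vec4 lcol[NUM_LIGHTS];           // light color
uniform sampler2D u_texture1;            // diffuse texture
uniform sampler2D u_texture2;            // normal map

varying LOWP vec4 vColor;
varying vec2 vTexCoord;
varying HIGHP vec2 vWorldNorm;           // hypothetical: normalized world position

const float WORLD_RATIO = 1440.0 / 2560.0;
const float EPS = 0.001;                 // assumes 8-bit vertex colors

// Branch-free version of the if/else chain on vColor.g.
// Note: step() tests ">=" where the original tests ">".
vec3 hitTint(float g) {
    vec3 c = vec3(0.0, 1.0, 1.0);
    c = mix(c, vec3(1.0, 1.0, 0.0), step(0.22, g));
    c = mix(c, vec3(0.0, 0.0, 1.0), step(0.37, g));
    c = mix(c, vec3(1.0, 0.0, 0.0), step(0.52, g));
    c = mix(c, vec3(1.0, 1.0, 1.0), step(0.67, g));
    return c;
}

void main() {
    LOWP vec4 diffuse = texture2D(u_texture1, vTexCoord);
    vec3 n = normalize(texture2D(u_texture2, vTexCoord).rgb * 2.0 - 1.0);

    // 1.0 where vColor.a is (almost) zero, 0.0 otherwise.
    float alphaZero = 1.0 - step(EPS, vColor.a);

    // Rotate the normal once per fragment instead of once per light.
    // A rotation about z keeps the length, so no second normalize is needed.
    float angle = 6.2831853 * vColor.r;  // vColor.r is an angle normalized to 0..1
    float c = cos(angle);
    float s = sin(angle);
    vec3 rotated = vec3(n.x * c - n.y * s, n.x * s + n.y * c, n.z);
    n = mix(n, rotated, alphaZero);

    // Constant trip count, no early-out: every light in this variant is active.
    vec3 sum = vec3(0.0);
    for (int i = 0; i < NUM_LIGHTS; i++) {
        HIGHP vec3 lightDir = vec3(lposNorm[i].xy - vWorldNorm, lposNorm[i].z);
        lightDir.x *= WORLD_RATIO;
        float d = length(lightDir);
        vec3 l = lightDir / d;
        vec3 diffuseTerm = lcol[i].rgb * lcol[i].a * max(dot(n, l), 0.0);
        vec3 ambient = acol[i].rgb * acol[i].a;
        float attenuation = 1.0 / (foff[i].x + foff[i].y * d + foff[i].z * d * d);
        sum += diffuse.rgb * (ambient + diffuseTerm * attenuation);
    }

    // Hit overlay without branches: a is 0.0 when the fragment is not a hit,
    // so mix() then returns the plain lit color.
    float isHit = step(EPS, vColor.g) * alphaZero;
    float a = diffuse.a * vColor.g * isHit;
    gl_FragColor = mix(vec4(sum, diffuse.a), vec4(hitTint(vColor.g), a), a);
}
```

Whether the loop is actually unrolled is up to the driver's compiler; a compile-time constant bound just makes that possible. If the sin/cos pair still shows up as a cost, passing the cosine and sine in from the vertex data instead of the angle would be the next thing to try.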

No quick fix I'm afraid.
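
For point 1, a vertex shader along these lines could replace screenCoordToWorldCoord. This is a sketch under several assumptions: the attribute and uniform names (a_position, a_color, a_texCoord0, u_projTrans, u_worldScale, u_worldOffset) are hypothetical, the projection is orthographic so that gl_Position.w is 1.0, and the viewport covers the full Resolution starting at (0, 0). The letterboxing case analysis then runs once per frame on the CPU and is folded into one scale and one offset.

```glsl
// Sketch only: all attribute and uniform names here are hypothetical.
attribute vec4 a_position;
attribute vec4 a_color;
attribute vec2 a_texCoord0;

uniform mat4 u_projTrans;    // assumed orthographic projection
// Computed on the CPU from the same three cases as screenCoordToWorldCoord:
//   k             = WORLD_HEIGHT / Resolution.y   or   WORLD_WIDTH / Resolution.x
//   offsetPx      = (0, bottom)                   or   (left, 0)
//   u_worldScale  =  Resolution * k / worldSize
//   u_worldOffset = -offsetPx   * k / worldSize
uniform vec2 u_worldScale;
uniform vec2 u_worldOffset;

varying vec4 vColor;
varying vec2 vTexCoord;
varying vec2 vWorldNorm;     // normalized world position for the fragment shader

void main() {
    vColor = a_color;
    vTexCoord = a_texCoord0;
    gl_Position = u_projTrans * a_position;

    // Map clip space to 0..1 screen space. With w == 1.0 the result is linear
    // across the triangle, so the interpolated value agrees with what
    // gl_FragCoord.xy / Resolution would give at each pixel center.
    vec2 screen01 = gl_Position.xy / gl_Position.w * 0.5 + 0.5;

    vWorldNorm = screen01 * u_worldScale + u_worldOffset;
}
```

The fragment shader would then read vWorldNorm directly instead of deriving worldFragCoordNormalized from gl_FragCoord and Resolution.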

Upvotes: 2
