Reputation: 782
I am running into performance issues on some phones with my light shader. The shader calculates line lights and point lights based on normal maps. It also indicates hits on targets (it's a 2D top-down shooter game).
I develop on a Sony Xperia Z5 Compact, where everything runs fine. Then I tried it on my very old Samsung Galaxy S2, where it is really slow. I didn't worry about that because of the phone's age.
Now I have tried it on a Galaxy S4 and a Galaxy S5, and it doesn't seem to run any faster than on the Galaxy S2. I also had huge compilation times (around 90 seconds), but I managed to bring that down to a few seconds by optimizing the code (I think it's still messy, though; I'm not really into shaders).
I really don't know what the bottleneck is here or why it isn't running any faster on the S4 and S5.
This is my shader:
#ifdef GL_ES
#define LOWP lowp
precision highp float;
#else
#define LOWP
#endif
#define MAX_NUM_POINTLIGHTS 16
uniform vec3 lpos[MAX_NUM_POINTLIGHTS]; // Light position
uniform vec3 foff[MAX_NUM_POINTLIGHTS]; // Light falloff
uniform vec4 acol[MAX_NUM_POINTLIGHTS]; // Ambient color
uniform vec4 lcol[MAX_NUM_POINTLIGHTS]; // Light color
//textures
uniform sampler2D u_texture1; // diffuse texture
uniform sampler2D u_texture2; // normalmap
varying vec4 vColor;
varying vec2 vTexCoord;
varying float vFlags;
uniform vec2 Resolution; //resolution of screen
const float WORLD_WIDTH = 1440.0;
const float WORLD_HEIGHT = 2560.0;
vec3 getPointLightColor(const vec4);
vec3 rotateVector(const vec3 vector, const float angle);
vec2 screenCoordToWorldCoord(const vec2 screencoord);
vec3 calculatePointLight(const vec4 DiffuseColor, vec3 LightPos, const vec3 Falloff, const vec4 LightColor, const vec4 AmbientColor);
const float stdratio = WORLD_HEIGHT / WORLD_WIDTH;
vec2 worldFragCoord;
const float worldRatio_W_DIV_H = WORLD_WIDTH / WORLD_HEIGHT;
const vec2 worldSize = vec2(WORLD_WIDTH, WORLD_HEIGHT);
// Light variables
vec3 NormalMap;
vec2 worldFragCoordNormalized;
vec3 N;
void main() {
    worldFragCoord = screenCoordToWorldCoord(gl_FragCoord.xy);
    // Common light calculations
    NormalMap = texture2D(u_texture2, vTexCoord).rgb;
    worldFragCoordNormalized = worldFragCoord/vec2(1440.0, 2560.0);
    N = normalize(NormalMap * 2.0 - 1.0);
    vec4 DiffuseColor = texture2D(u_texture1, vTexCoord);
    vec2 fragcoord = gl_FragCoord.xy;
    vec3 pointcolor = getPointLightColor(DiffuseColor);
    vec4 finalColor;
    // green channel of vColor indicates hit
    if (vColor.g > 0.0 && vColor.a == 0.0) {
        vec4 fragcol = vec4(pointcolor, DiffuseColor.a);
        vec4 addColor;
        if (vColor.g > 0.67)
            addColor = vec4(1.0,1.0,1.0, DiffuseColor.a*vColor.g);
        else if (vColor.g > 0.52)
            addColor = vec4(1.0,0.0,0.0, DiffuseColor.a*vColor.g);
        else if (vColor.g > 0.37)
            addColor = vec4(0.0,0.0,1.0, DiffuseColor.a*vColor.g);
        else if (vColor.g > 0.22)
            addColor = vec4(1.0,1.0,0.0, DiffuseColor.a*vColor.g);
        else
            addColor = vec4(0.0,1.0,1.0, DiffuseColor.a*vColor.g);
        finalColor = addColor*addColor.a + fragcol*(1.0-addColor.a);
    }
    else
        finalColor = vec4(pointcolor, DiffuseColor.a);
    gl_FragColor = finalColor;
}
vec3 rotateVector(const vec3 vector, const float angle){
    float degree = radians(360.0*angle); // Angle is normalized to 0 - 1
    float cos_ = cos(degree);
    float sin_ = sin(degree);
    return vec3(vector.x*cos_ - vector.y*sin_, vector.x*sin_ + vector.y*cos_, vector.z);
}
vec3 calculatePointLight(const vec4 DiffuseColor, vec3 LightPos, const vec3 Falloff, const vec4 LightColor, const vec4 AmbientColor){
    if (LightPos.x == 0.0 && LightPos.y == 0.0)
        return vec3(0.0);
    LightPos.xy = LightPos.xy / worldSize;
    //The delta position of light
    vec3 LightDir = vec3(LightPos.xy - worldFragCoordNormalized, LightPos.z);
    //Correct for aspect ratio
    LightDir.x *= worldRatio_W_DIV_H;
    //Determine distance (used for attenuation) BEFORE we normalize our LightDir
    float D = length(LightDir);
    //normalize our vectors
    vec3 L = normalize(LightDir);
    vec3 NN = N;
    if (vColor.a == 0.0)
        NN = normalize(rotateVector(NN, vColor.r));
    //Pre-multiply light color with intensity
    //Then perform "NN dot L" to determine our diffuse term
    vec3 Diffuse = (LightColor.rgb * LightColor.a) * max(dot(NN, L), 0.0);
    //pre-multiply ambient color with intensity
    vec3 Ambient = AmbientColor.rgb * AmbientColor.a;
    //calculate attenuation
    float Attenuation = 1.0 / ( Falloff.x + (Falloff.y*D) + (Falloff.z*D*D) );
    //the calculation which brings it all together
    vec3 Intensity = Ambient + Diffuse * Attenuation;
    vec3 FinalColor = DiffuseColor.rgb * Intensity;
    return FinalColor;
}
vec3 getPointLightColor(const vec4 DiffuseColor){
    vec3 sum = vec3(0.0);
    for (int i = 0; i < MAX_NUM_POINTLIGHTS; i++)
    {
        sum += calculatePointLight(DiffuseColor, lpos[i], foff[i], lcol[i], acol[i]);
    }
    return sum;
}
vec2 screenCoordToWorldCoord(const vec2 screencoord){
    float ratio = Resolution.y / Resolution.x;
    vec2 resCoord;
    if (ratio == stdratio){
        // Ratio is standard
        resCoord = screencoord * (WORLD_HEIGHT / Resolution.y);
    } else if (ratio > stdratio) {
        // Screen gets extended vertically (black bars top/bottom)
        float screenheight = Resolution.x * stdratio;
        float bottom = (Resolution.y - screenheight) / 2.0;
        resCoord = vec2(screencoord.x, screencoord.y - bottom);
        resCoord *= (WORLD_WIDTH / Resolution.x);
    } else {
        // Screen gets extended horizontally (black bars left/right)
        float screenwidth = Resolution.y / stdratio;
        float left = (Resolution.x - screenwidth) / 2.0;
        resCoord = vec2(screencoord.x - left, screencoord.y);
        resCoord *= (WORLD_HEIGHT / Resolution.y);
    }
    return resCoord;
}
Upvotes: 0
Views: 597
Reputation: 6766
Here's what I'd do to tackle it:
Remove screenCoordToWorldCoord. It's a simple transformation: you could do it with a matrix multiply or a couple of dot products or, better still, move the work to the vertex shader and pass the result in a varying rather than constructing it from gl_FragCoord.
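As a rough sketch of the first option: all three branches of screenCoordToWorldCoord boil down to (screencoord - bars) * scale, so the per-pixel work can shrink to one multiply-add if the CPU computes the scale and offset whenever the resolution changes. The uniform names u_screenToWorldScale and u_screenToWorldOffset are made up for illustration and are not part of the original shader.

```glsl
#ifdef GL_ES
precision highp float;
#endif

// Hypothetical uniforms (not in the original shader), set on the CPU on resize.
// With stdratio = WORLD_HEIGHT / WORLD_WIDTH and ratio = Resolution.y / Resolution.x:
//   ratio >  stdratio: scale = WORLD_WIDTH  / Resolution.x, bars = (0.0, bottom)
//   ratio <= stdratio: scale = WORLD_HEIGHT / Resolution.y, bars = (left, 0.0)
//   offset = -bars * scale
uniform float u_screenToWorldScale;
uniform vec2 u_screenToWorldOffset;

vec2 screenCoordToWorldCoord(const vec2 screencoord) {
    // One multiply-add, no branches.
    return screencoord * u_screenToWorldScale + u_screenToWorldOffset;
}

void main() {
    vec2 worldFragCoord = screenCoordToWorldCoord(gl_FragCoord.xy);
    // Debug output only: shows the normalized world coordinate as red/green.
    gl_FragColor = vec4(worldFragCoord / vec2(1440.0, 2560.0), 0.0, 1.0);
}
```

The varying route would look similar, with the multiply-add done per vertex instead; which of the two is cheaper probably depends on the device.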
Compile a different version of the shader for each light count, and unroll the for loop. You get to drop the if at the top of calculatePointLight too.
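One way this could look, assuming the application prepends a line such as "#define NUM_POINTLIGHTS 3" to the source before compiling (NUM_POINTLIGHTS is a name chosen here, not something from the original). The excerpt replaces the uniform declarations and getPointLightColor; calculatePointLight stays as it is, minus the check for an unused light.

```glsl
#ifdef GL_ES
precision highp float;
#endif

// Assumption: the application injects the real count (1..16) ahead of this line.
// The fallback keeps the shader compiling when nothing is injected.
#ifndef NUM_POINTLIGHTS
#define NUM_POINTLIGHTS 16
#endif

uniform vec3 lpos[NUM_POINTLIGHTS]; // Light position
uniform vec3 foff[NUM_POINTLIGHTS]; // Light falloff
uniform vec4 acol[NUM_POINTLIGHTS]; // Ambient color
uniform vec4 lcol[NUM_POINTLIGHTS]; // Light color

// Defined elsewhere in the shader, as in the original.
vec3 calculatePointLight(const vec4 DiffuseColor, vec3 LightPos, const vec3 Falloff, const vec4 LightColor, const vec4 AmbientColor);

vec3 getPointLightColor(const vec4 DiffuseColor) {
    vec3 sum = vec3(0.0);
    // The bound is a compile-time constant, so the compiler is free to unroll
    // the loop, and every iteration belongs to a light that is actually in use.
    for (int i = 0; i < NUM_POINTLIGHTS; i++) {
        sum += calculatePointLight(DiffuseColor, lpos[i], foff[i], lcol[i], acol[i]);
    }
    return sum;
}
```

A count of zero would need its own variant without the arrays, since zero-length arrays are not allowed.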
Remove all your remaining if statements - some devices hate conditionals. Do the logic with maths instead; the step function helps.
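For example, the hit-colour chain in main could be written with step and mix roughly like this. The helper names gt and applyHit are invented for the sketch, and it assumes vColor.a is never negative so that "a == 0.0" can be tested as "a <= 0.0".

```glsl
// 1.0 when x > edge, else 0.0 (strict comparison, like the original ifs).
float gt(float x, float edge) {
    return 1.0 - step(x, edge);
}

// fragcol is the lit colour, vertexColor is vColor from the original shader.
vec4 applyHit(vec4 fragcol, vec4 vertexColor) {
    float g = vertexColor.g;
    // Start with the colour of the final else branch, then let each higher
    // threshold override it, in ascending order.
    vec3 tint = vec3(0.0, 1.0, 1.0);
    tint = mix(tint, vec3(1.0, 1.0, 0.0), gt(g, 0.22));
    tint = mix(tint, vec3(0.0, 0.0, 1.0), gt(g, 0.37));
    tint = mix(tint, vec3(1.0, 0.0, 0.0), gt(g, 0.52));
    tint = mix(tint, vec3(1.0, 1.0, 1.0), gt(g, 0.67));
    // 1.0 only when g > 0.0 and a <= 0.0 (assumes a is never negative).
    float isHit = gt(g, 0.0) * step(vertexColor.a, 0.0);
    vec4 addColor = vec4(tint, fragcol.a * g);
    // Same blend as addColor*addColor.a + fragcol*(1.0-addColor.a),
    // switched off entirely when isHit is 0.0.
    return mix(fragcol, addColor, addColor.a * isHit);
}
```

In main this would be used as gl_FragColor = applyHit(vec4(pointcolor, DiffuseColor.a), vColor). Whether it actually beats the branches varies by GPU, so it's worth measuring.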
Is there a way to drop rotateVector? I can't quite figure out what it's doing, but it's expensive and feels like something that shouldn't be necessary in the fragment shader. At the very least, it needn't be in the inner loop because the result is the same regardless of the light. It might be better done with some sort of matrix multiply rather than using sin/cos.
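A sketch of both ideas, assuming the rotation is constant across a sprite so that its cosine and sine can be computed per vertex and passed down. The varying vRotation and the function rotateZ are hypothetical names, not part of the original shader.

```glsl
// Hypothetical varying: (cos, sin) of the sprite rotation, computed in the
// vertex shader. Assumes all vertices of a sprite carry the same rotation.
varying vec2 vRotation;

// Rotate v around the z axis using a precomputed (cos, sin) pair.
vec3 rotateZ(const vec3 v, const vec2 cs) {
    return vec3(v.x * cs.x - v.y * cs.y,
                v.x * cs.y + v.y * cs.x,
                v.z);
}

// In main(), once per fragment instead of once per light:
//
//     N = normalize(NormalMap * 2.0 - 1.0);
//     if (vColor.a == 0.0)
//         N = rotateZ(N, vRotation);
//
// calculatePointLight can then use N directly. A rotation does not change the
// length of a vector, so the second normalize can go as well.
```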
Use precision correctly. Some devices can do maths much faster at lowp/mediump than at highp. Rule of thumb: lowp for colours, mediump for normals, highp for positions.
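Applied to the declarations in the question, the rule of thumb might look something like this. The MEDP and HIGHP macros are additions in the style of the existing LOWP macro, and the main body is only a placeholder to show qualifiers on locals.

```glsl
#ifdef GL_ES
#define LOWP lowp
#define MEDP mediump
// highp is optional in ES 2.0 fragment shaders; fall back when it is missing.
#ifdef GL_FRAGMENT_PRECISION_HIGH
#define HIGHP highp
#else
#define HIGHP mediump
#endif
precision mediump float; // default for anything left unqualified
#else
#define LOWP
#define MEDP
#define HIGHP
#endif

#define MAX_NUM_POINTLIGHTS 16
uniform HIGHP vec3 lpos[MAX_NUM_POINTLIGHTS]; // positions: highp
uniform MEDP vec3 foff[MAX_NUM_POINTLIGHTS];  // falloff coefficients
uniform LOWP vec4 acol[MAX_NUM_POINTLIGHTS];  // colours: lowp
uniform LOWP vec4 lcol[MAX_NUM_POINTLIGHTS];

uniform sampler2D u_texture1; // diffuse texture
uniform sampler2D u_texture2; // normalmap
varying MEDP vec2 vTexCoord;

void main() {
    LOWP vec4 diffuse = texture2D(u_texture1, vTexCoord);
    MEDP vec3 n = normalize(texture2D(u_texture2, vTexCoord).rgb * 2.0 - 1.0);
    // Placeholder shading, only here to make the sketch a complete shader.
    gl_FragColor = vec4(diffuse.rgb * max(n.z, 0.0), diffuse.a);
}
```

Keep in mind that lowp is only guaranteed to cover about -2 to 2, so intermediate sums of several bright lights may need mediump.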
Do some light culling on the CPU. I guess not all the lights affect every pixel. If you can chop your scene up into tiles and only let the most important lights count then you can do a lot less work.
LightPos.xy / worldSize looks like something you could do once on the CPU instead of once per pixel.
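On the shader side, that just means trusting the uniform to arrive already divided. A minimal sketch, assuming lpos[i].xy is uploaded as position / (WORLD_WIDTH, WORLD_HEIGHT); the function name lightDirection is made up here.

```glsl
const float WORLD_WIDTH = 1440.0;
const float WORLD_HEIGHT = 2560.0;

// lightPosNormalized.xy is assumed to be pre-divided by the world size on the
// CPU, so the per-pixel division from calculatePointLight disappears.
vec3 lightDirection(const vec3 lightPosNormalized, const vec2 fragNormalized) {
    vec3 dir = vec3(lightPosNormalized.xy - fragNormalized, lightPosNormalized.z);
    // Correct for aspect ratio, as in the original.
    dir.x *= WORLD_WIDTH / WORLD_HEIGHT;
    return dir;
}
```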
No quick fix I'm afraid.
Upvotes: 2