user4161529

OpenGL GLSL uniform branching vs. Multiple shaders

I've been reading many articles on using uniform-controlled if statements to change the behavior of large "uber shaders". I started on an uber shader (OpenGL, LWJGL), but then I realized that simply adding an if statement controlled by a uniform to the fragment shader, doing only simple calculations, decreased my FPS by 5 compared to separate shaders without uniform if statements. I haven't capped my frame rate; it's just refreshing as fast as possible. I'm about to add normal mapping and parallax mapping, and I can see two routes:

Uber vertex shader:

#version 400 core

layout(location = 0) in vec3 position;
layout(location = 1) in vec2 textureCoords;
layout(location = 2) in vec3 normal;

uniform mat4 MVPmatrix;
uniform int RenderFlag; // 0 = normal mapping, 1 = parallax mapping

void main(void) {
    if (RenderFlag == 0) {
        // Calculate out variables for normal mapping to the fragment shader
    }

    if (RenderFlag == 1) {
        // Calculate out variables for parallax mapping to the fragment shader
    }

    gl_Position = MVPmatrix * vec4(position, 1.0);
}

Uber fragment shader:

#version 400 core

in vec3 ReflectDirR; // assumed to be passed in from the vertex shader

uniform samplerCube cube_texture;
uniform int RenderFlag;      // 0 = normal mapping, 1 = parallax mapping
uniform bool reflectionFlag; // if set, either of the 2 render modes will have some
                             // reflection of the skybox added to it, like a reflective surface

out vec4 finalColor;

void main(void) {
    if (RenderFlag == 0) {
        // display normal mapping

        if (reflectionFlag) {
            vec4 reflectColor = texture(cube_texture, ReflectDirR);
            // add reflection color to final color and output
        }
    }

    if (RenderFlag == 1) {
        // display parallax mapping

        if (reflectionFlag) {
            vec4 reflectColor = texture(cube_texture, ReflectDirR);
            // add reflection color to final color and output
        }
    }
}

The benefit of this (for me) is simplicity in the flow, but it makes the overall program more complex and I'm faced with ugly nested if statements. Also, if I wanted to avoid if statements completely, I would need 4 separate shaders, one to handle each possible branch (normal mapping w/o reflection, normal mapping with reflection, parallax w/o reflection, parallax with reflection), just for one feature: reflection.

1: Does GLSL execute both branches (and any nested branches), calculate BOTH results, and then output the correct one?

2: Instead of a uniform flag for the reflection, should I remove the if statement and calculate the reflection color regardless, adding it to the final color (if it is a relatively small operation) with something like

finalColor = finalColor + reflectionColor * X;

where X is a uniform variable: X == 0 for no reflection, X == some amount for reflection.
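
To make option 2 concrete, here is a minimal sketch of a branch-free fragment shader. The names passTextureCoords, diffuseTexture and reflectionAmount (standing in for X) are hypothetical and only there so the listing is complete; cube_texture and ReflectDirR are taken from the listing above.

```glsl
#version 400 core

in vec2 passTextureCoords;        // hypothetical: interpolated UVs from the vertex shader
in vec3 ReflectDirR;              // reflection direction from the vertex shader

uniform sampler2D diffuseTexture; // hypothetical base color texture
uniform samplerCube cube_texture; // skybox cube map
uniform float reflectionAmount;   // the "X" above: 0.0 disables the reflection

out vec4 finalColor;

void main(void) {
    vec4 baseColor = texture(diffuseTexture, passTextureCoords);

    // Always sampled, even when reflectionAmount is 0.0
    vec4 reflectionColor = texture(cube_texture, ReflectDirR);

    // No if statement: the uniform scales the contribution instead
    finalColor = baseColor + reflectionColor * reflectionAmount;
}
```

The trade-off in this sketch is that the cube map fetch is paid for on every fragment, including those where X is 0.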

Upvotes: 8

Views: 4635

Answers (1)

Andon M. Coleman

Reputation: 43319

Right off the bat, let me point out that OpenGL 4.0 added shader subroutines, which are sort of a combination of both approaches you discussed. However, unless you are using a massive number of permutations of a single basic shader that gets swapped out multiple times during a frame (as you might with a dynamic material system in a forward rendering engine), subroutines really are not a performance win. I've put some time and effort into this in my own work: I get worthwhile improvements on one particular hardware/driver combination, and no appreciable change (good or bad) on most others.
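
For reference, a rough sketch of what a subroutine-based reflection toggle could look like in a GLSL 4.00 fragment shader. Only the subroutine syntax itself is standard; the function names, passTextureCoords and diffuseTexture are made up for illustration, and the 0.5 blend factor is arbitrary.

```glsl
#version 400 core

in vec2 passTextureCoords;        // hypothetical input from the vertex shader
in vec3 ReflectDirR;

uniform sampler2D diffuseTexture; // hypothetical base color texture
uniform samplerCube cube_texture;

out vec4 finalColor;

// Subroutine type: any function with this signature can be selected
subroutine vec4 ReflectionFunc(vec4 baseColor);

subroutine(ReflectionFunc)
vec4 noReflection(vec4 baseColor) {
    return baseColor;
}

subroutine(ReflectionFunc)
vec4 skyboxReflection(vec4 baseColor) {
    // 0.5 is an arbitrary blend factor for this sketch
    return baseColor + 0.5 * texture(cube_texture, ReflectDirR);
}

// The application picks one of the functions above before drawing
subroutine uniform ReflectionFunc applyReflection;

void main(void) {
    vec4 baseColor = texture(diffuseTexture, passTextureCoords);
    finalColor = applyReflection(baseColor);
}
```

On the application side, the function index would be looked up with glGetSubroutineIndex and selected with glUniformSubroutinesuiv. The selection is context state rather than program state, so it has to be set again after each glUseProgram.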

Why did I bring up subroutines? Mostly because you're discussing what amounts to micro optimization, and subroutines are a really good example of why it doesn't pay to invest a whole lot of time thinking about that until the very end of development. If you're struggling to meet some performance number and you've crossed every high-level optimization strategy off the list, then you can worry about this stuff.

That said, it's almost impossible to answer how GLSL executes your shader. It's just a high-level language; the underlying hardware architectures have changed several times over since GLSL was created. The latest generation of hardware has actual branch predication and some pretty complicated threading engines that GLSL 1.10 class hardware never had, some of which is actually exposed directly through compute shaders now.

You could run the numbers to see which strategy works best on your hardware, but I think you'll find it's the old micro-optimization dilemma: you may not even get enough of a measurable difference in performance to decide which approach to take. Keep in mind that "uber shaders" are attractive for multiple reasons (not all performance related), not the least of which is that you may have fewer and less complicated draw commands to batch. If there's no appreciable difference in performance, consider the design that's simpler and easier to implement and maintain instead.
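
If separate shaders do turn out to be the better fit, one common way to keep them maintainable, not covered above, is to compile several variants from a single source file using preprocessor defines. A sketch under that assumption follows; the USE_PARALLAX and USE_REFLECTION macros, passTextureCoords and diffuseTexture are all hypothetical names.

```glsl
#version 400 core
// The application is assumed to insert lines such as
//   #define USE_PARALLAX
//   #define USE_REFLECTION
// right after the #version line before compiling each variant.

in vec2 passTextureCoords;        // hypothetical input from the vertex shader
in vec3 ReflectDirR;

uniform sampler2D diffuseTexture; // hypothetical base color texture
uniform samplerCube cube_texture;

out vec4 finalColor;

void main(void) {
    vec4 color = texture(diffuseTexture, passTextureCoords);

#ifdef USE_PARALLAX
    // parallax mapping code would go here
#else
    // normal mapping code would go here
#endif

#ifdef USE_REFLECTION
    // compiled in only for the reflective variants
    color += texture(cube_texture, ReflectDirR);
#endif

    finalColor = color;
}
```

Since glShaderSource accepts an array of strings, the version line, the defines and the shader body can be passed as separate pieces. The four combinations from the question then come from one file, and none of the compiled programs contains a runtime branch on a uniform.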

Upvotes: 5
