Haydn V. Harach

Reputation: 1275

Are there any Order-Independent Transparency techniques suitable for deferred shading?

I've surveyed a number of order-independent transparency methods for my OpenGL engine, and at first I thought I'd use weighted average blending to maximize speed.

However, my engine uses deferred shading, and I need to take this into account when choosing a blending technique. Ideally I would like a technique that wouldn't ask me to implement forward shading to use for translucent objects.

There are a number of cases where I need to use transparency.

I am willing to sacrifice image correctness for the sake of speed (hence my initial choice of weighted average blending). I don't need every layer of translucent objects to be lit, but I would at least like the front-most pixels to be properly lit.

I'm using OpenGL 3.x+ Core Context, so I would like to avoid anything that requires OpenGL 4.x (as lovely as it would be to use), but I can freely use anything that isn't available in OpenGL 2.x.

My question is: what is the best order-independent transparency technique for deferred shading? And/or: what is the best way to light/shade a translucent object when using deferred shading?

P.S. Is there a better way to render anti-aliased cutouts (grass/hair/leaves) that doesn't rely on blending? Pure alpha testing tends to produce ugly aliasing.

Upvotes: 4

Views: 3904

Answers (2)

Tab

Reputation: 101

The way I do it:

  • Render the whole scene at low resolution with dithered rendering for transparent surfaces (aka inferred rendering), rotating the dithering mask using the draw ID (any ID really, as long as it's unique within the frame). Render draw ID, world-space normals, fragment depth (see Getting World Position from Depth Buffer Value) and BRDF alpha (see this) to a framebuffer.
  • Do your lighting (diffuse and specular only), SSR or SSAO passes as usual.
  • Render opaque and "cutoff" surfaces at normal resolution (aka the "material pass") to an "opaque" framebuffer (OFB).
  • Create two "transparent" framebuffers (FB0 and FB1), with FB1 at half of FB0's resolution.
  • Render transparent surfaces without blending to FB0 with depth test/write enabled.
  • Blit FB0's depth bits to FB1 (see the sketch after this list).
  • Using blended OIT, render transparent surfaces again to FB1, but with glDepthFunc(GL_GREATER) and glDepthMask(GL_FALSE); manually test against the opaque scene's depth for discard in the shader (a bit slower, but AFAIK you can't bind 2 depth buffers).
  • Generate mipmaps for the OFB.
  • Manually composite the farthest transparent surfaces from FB1 to OFB mip 0, sampling from OFB mip 1 and up in the shader (slower, but allows for distortion and rough/colored transmission).
  • Generate mipmaps for the OFB again.
  • Composite the closest transparent surfaces from FB0 to OFB mip 0, sampling from OFB mip 1, which now contains the transparent surfaces from FB1.
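
A minimal sketch of the depth blit and the reversed-depth second pass from the list above (fb0, fb1, w and h are hypothetical handles/dimensions; the GL calls themselves are standard OpenGL 3.x):

// Copy FB0's depth into the half-resolution FB1 (depth blits must use GL_NEAREST)
glBindFramebuffer(GL_READ_FRAMEBUFFER, fb0);
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, fb1);
glBlitFramebuffer(0, 0, w, h, 0, 0, w / 2, h / 2,
                  GL_DEPTH_BUFFER_BIT, GL_NEAREST);

// Second transparency pass: keep only fragments behind the closest layer
glBindFramebuffer(GL_FRAMEBUFFER, fb1);
glEnable(GL_DEPTH_TEST);
glDepthFunc(GL_GREATER);  // reversed test against FB0's blitted depth
glDepthMask(GL_FALSE);    // don't overwrite that depth
// ... draw transparent geometry; the fragment shader also samples the opaque
// scene's depth and discards fragments that lie behind it ...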

See Morgan McGuire's blog for how to do the compositing and how to construct your "transparent" framebuffers. Reconstruct transparent surfaces using your draw-ID framebuffer and the normals; I use a simple weighted average, where the weight is the dot product of the current normal with the framebuffer's normal (a normal dotted with itself gives 1).

Downsides:

  • It's not single-pass, but your first render and lighting passes are done at a lower resolution, so the performance hit isn't that bad; plus, the "farthest" transparent surfaces are at half the normal resolution.
  • You only get 4 layers of transparency before falling back to IBL. If you're using spherical harmonics for dynamic lights as a fallback solution, you're fine, though. You could get 8 layers using a larger dithering mask, but reconstructing the surface would be slower.

Upsides:

  • It allows for tons of lights, like any deferred renderer.
  • Because you mix deferred and forward rendering, it allows for per-material environment maps and BRDF lookup tables, which is handy.
  • Lookup-heavy techniques such as SSR and SSAO are done at lower resolution, which helps performance even more.
  • Like SSR and SSAO, lighting is done at low resolution.
  • This allows for fancy effects like screen-space refraction and transparent-surface transmission.
  • Having two transparency layers means refractive surfaces can distort themselves, though only the closest surfaces can distort the surfaces behind them (otherwise you get ugly artifacts).

My weight function for blended OIT (closer surfaces with the same opacity always get a higher weight):

void WritePixel(vec3 premultipliedReflect, float coverage)
{
    // CameraSpaceDepth comes from the vertex shader; out_0/out_1 are the
    // accumulation and revealage render targets.
    float z = abs(CameraSpaceDepth);
    // Depth-based weight: closer surfaces with the same opacity get a higher weight
    float w = clamp(pow(abs(1 / z), 4.f) * coverage * coverage, 6.1*1e-4, 1e5);

    out_0 = vec4(premultipliedReflect, coverage) * w;
    out_1 = vec4(1 - coverage); // so you can render without blending
}

My compositing function:

vec4 accum = texelFetch(in_Buffer0, ivec2(gl_FragCoord.xy), 0); // weighted accumulation
float r = texelFetch(in_Buffer1, ivec2(gl_FragCoord.xy), 0).r;  // revealage
// The clamp keeps the weight sum within half-float range before normalizing
out_Buffer0 = vec4(accum.rgb / clamp(accum.a, 6.1*1e-4, 6.55*1e5), r);

See this about "CameraSpaceDepth" and this about fp values.

Here is the result for this model with a dirty, hacked-together POC; you can see rough-surface transmission: Result

Upvotes: 0

Bim

Reputation: 1068

I'm not sure it fits your deferred renderer, but you might consider weighted, blended order-independent transparency. There's an older version without colored transmission (web) and a newer version that supports colored transmission (web) and a lot of other stuff. It is quite fast because it only uses one opaque, one transparency, and one composition pass, and it works with OpenGL 3.2+.
I implemented the first version and it works quite well, depending on your scene and a properly tuned weighting function, but it has problems with high alpha values. I didn't get good results with the weighting functions from the papers; they only worked well after I switched to linear, normalized eye-space z-values.
Note that with OpenGL < 4.0 you cannot specify a blending function per buffer (glBlendFunci), so you need to work around that (see the first paper and the sketch below).
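
For orientation, this is roughly what the per-buffer blend state would look like on OpenGL 4.0+ with glBlendFunci, following the blend functions from the first paper (a hedged sketch, not the GL 3.x path used in the steps below; attachment 1 is accumulation, attachment 2 is revealage):

glEnable(GL_BLEND);
glBlendEquation(GL_FUNC_ADD);
glBlendFunci(1, GL_ONE, GL_ONE);                  // accumulation: purely additive
glBlendFunci(2, GL_ZERO, GL_ONE_MINUS_SRC_COLOR); // revealage: running (1 - alpha) product
// On GL 3.x only one shared glBlendFuncSeparate is available, so the steps below
// store the (1 - alpha) product in the accumulation target's alpha channel instead.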

  • Set up a frame buffer with these attachments (see the setup sketch after the composition shader at the end of this answer):
    • 0: RGBA8, opaque
    • 1: RGBA16F, accumulation
    • 2: R16F, revealage
  • Clear attachment #0 to your screen clear color (r,g,b,1), #1 to (0,0,0,1) and #2 to 0.
  • Render opaque geometry to attachment #0 and depth buffer.

    glEnable(GL_DEPTH_TEST);
    glDepthMask(GL_TRUE);

  • Render transparent geometry to attachments #1 and #2. Turn off depth buffer writes, but leave depth testing enabled.

    glDepthMask(GL_FALSE);
    glEnable(GL_BLEND);
    glBlendEquation(GL_FUNC_ADD);
    glBlendFuncSeparate(GL_ONE, GL_ONE, GL_ZERO, GL_ONE_MINUS_SRC_ALPHA);

The fragment shader part writing the accumulation and revealage targets looks like this:

uniform mat4 projectionMatrix;

layout (location = 1) out vec4 accum;
layout (location = 2) out float revealage;

/// @param color Regular RGB reflective color of fragment, not pre-multiplied
/// @param alpha Alpha value of fragment
/// @param wsZ Window-space z value == gl_FragCoord.z
void writePixel(vec3 color, float alpha, float wsZ) {
    float ndcZ = 2.0 * wsZ - 1.0;
    // linearize depth for proper depth weighting
    //See: https://stackoverflow.com/questions/7777913/how-to-render-depth-linearly-in-modern-opengl-with-gl-fragcoord-z-in-fragment-sh
    //or: https://stackoverflow.com/questions/11277501/how-to-recover-view-space-position-given-view-space-depth-value-and-ndc-xy
    float linearZ = (projectionMatrix[2][2] + 1.0) * wsZ / (projectionMatrix[2][2] + ndcZ);
    float tmp = (1.0 - linearZ) * alpha;
    //float tmp = (1.0 - wsZ * 0.99) * alpha * 10.0; // <-- original weighting function from paper #2
    float w = clamp(tmp * tmp * tmp * tmp * tmp * tmp, 0.0001, 1000.0);
    accum = vec4(color * alpha * w, alpha);
    revealage = alpha * w;
}
  • Bind attachment textures #1 and #2 and composite them to attachment #0 by drawing a quad with a composition shader.

    glEnable(GL_BLEND);
    glBlendFunc(GL_ONE_MINUS_SRC_ALPHA, GL_SRC_ALPHA);

The fragment shader for composition looks like this:

uniform sampler2DMS accumTexture;
uniform sampler2DMS revealageTexture;

in vec2 texcoordVar;
out vec4 fragmentColor;

void main() {
    ivec2 bufferCoords = ivec2(gl_FragCoord.xy);
    vec4 accum = texelFetch(accumTexture, bufferCoords, 0);
    float revealage = accum.a;
    // save the blending and color texture fetch cost
    /*if (revealage == 1.0) {
        discard;
    }*/
    accum.a = texelFetch(revealageTexture, bufferCoords, 0).r;
    // guard against the weight sum overflowing fp16
    if (isinf(accum.a)) {
        accum.a = max(max(accum.r, accum.g), accum.b);
    }
    // guard against the accumulated color overflowing fp16
    if (any(isinf(accum.rgb))) {
        accum = vec4(isinf(accum.a) ? 1.0 : accum.a);
    }
    vec3 averageColor = accum.rgb / max(accum.a, 1e-4);
    // dst' = (accum.rgb / accum.a) * (1 - revealage) + dst * revealage
    fragmentColor = vec4(averageColor, revealage);
}
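
For completeness, a minimal sketch of the frame buffer setup from the list at the top of this answer (handle names and width/height are hypothetical, the depth attachment is omitted, and plain 2D textures are used; if you render multisampled, as the sampler2DMS uniforms above imply, create the attachments with glTexImage2DMultisample instead). glClearBufferfv is used because the three attachments need different clear values:

GLuint tex[3], fbo;
glGenTextures(3, tex);
glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);

// 0: RGBA8 opaque, 1: RGBA16F accumulation, 2: R16F revealage
const GLint  internalFmt[3] = { GL_RGBA8, GL_RGBA16F, GL_R16F };
const GLenum pixelFmt[3]    = { GL_RGBA,  GL_RGBA,    GL_RED  };
for (int i = 0; i < 3; ++i) {
    glBindTexture(GL_TEXTURE_2D, tex[i]);
    glTexImage2D(GL_TEXTURE_2D, 0, internalFmt[i], width, height, 0,
                 pixelFmt[i], GL_FLOAT, NULL);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0 + i,
                           GL_TEXTURE_2D, tex[i], 0);
}
const GLenum bufs[3] = { GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1,
                         GL_COLOR_ATTACHMENT2 };
glDrawBuffers(3, bufs);

// Per-attachment clears: #0 = screen clear color (r,g,b,1), #1 = (0,0,0,1), #2 = 0
const float clearOpaque[4] = { 0.0f, 0.0f, 0.0f, 1.0f }; // your clear color here
const float clearAccum[4]  = { 0.0f, 0.0f, 0.0f, 1.0f };
const float clearReveal[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
glClearBufferfv(GL_COLOR, 0, clearOpaque);
glClearBufferfv(GL_COLOR, 1, clearAccum);
glClearBufferfv(GL_COLOR, 2, clearReveal);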

Upvotes: 2
