Stencil stipple

Started by Josh Klint
5 comments, last by JoeJ 1 year, 11 months ago

I'm doing some screen-space reflections that are pretty expensive. I'd like to render them to a half-size buffer, but they require a lot of input information, so I end up with a lot of color attachments on my first render buffer.

I was thinking maybe I could draw to every fourth pixel in the initial pass, but this is exactly the kind of problem GPU logic would probably not be efficient with. An if statement would likely result in all four pixels executing the expensive reflection code.

So I was thinking maybe I could use the stencil buffer to only render to one pixel for each 2x2 group. I have never used the stencil buffer before, and it seems like something that rarely gets used.

Is this a good idea for performance? How do I set this up? I just want a stencil pattern like this across the whole screen:

1 0
0 0
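
For reference, the pattern above boils down to a per-pixel rule: a pixel gets stencil value 1 when both its x and y coordinates are even. A minimal CPU-side sketch in Python, assuming the 1 sits in the top-left pixel of each 2x2 block; the name `stipple_mask` is made up for illustration and is not part of any graphics API:

```python
def stipple_mask(width, height):
    """Return rows of 0/1 stencil values, with a 1 in the top-left
    pixel of every 2x2 block (assumed layout, as drawn above)."""
    return [[1 if (x % 2 == 0 and y % 2 == 0) else 0 for x in range(width)]
            for y in range(height)]

# Print a 4x4 corner of the screen to check the pattern by eye.
for row in stipple_mask(4, 4):
    print(" ".join(str(v) for v in row))
```

On the GPU the same rule would typically be written into the stencil buffer once by a cheap full-screen pass, and the expensive pass would then only be allowed where the stencil value equals 1.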

10x Faster Performance for VR: www.ultraengine.com

Josh Klint said:
So I was thinking maybe I could use the stencil buffer to only render to one pixel for each 2x2 group.

Someone else needs to confirm, but I don't think this would help. 3 out of 4 threads would still be idle.

But you could use compute shaders instead of pixel shaders; then each thread can be assigned to pixels at will. Not sure if you'd miss some pixel shader features, though.
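
To make the difference concrete, here is a rough CPU-side model in Python. It assumes pixel shaders are launched in whole 2x2 quads and that a lane whose pixel fails the stencil test still occupies its slot; that matches the concern above but is a simplification of real hardware. The names `quad_utilization` and `compute_thread_to_pixel` are hypothetical, chosen for this sketch only.

```python
def quad_utilization(mask):
    """Fraction of launched lanes doing useful work, assuming a whole
    2x2 quad is launched whenever any pixel in it passes the stencil."""
    height, width = len(mask), len(mask[0])
    launched = useful = 0
    for qy in range(0, height, 2):
        for qx in range(0, width, 2):
            passing = sum(mask[y][x]
                          for y in range(qy, min(qy + 2, height))
                          for x in range(qx, min(qx + 2, width)))
            if passing:
                launched += 4      # the whole quad occupies lanes
                useful += passing  # only passing pixels do real work
    return useful / launched if launched else 0.0

def compute_thread_to_pixel(tx, ty):
    """Compute-style mapping: thread (tx, ty) handles full-res pixel
    (2*tx, 2*ty), so every launched thread has work to do."""
    return 2 * tx, 2 * ty

# One passing pixel per 2x2 block, as in the stipple pattern.
mask = [[1 if (x % 2 == 0 and y % 2 == 0) else 0 for x in range(8)]
        for y in range(8)]
print(quad_utilization(mask))         # 0.25
print(compute_thread_to_pixel(3, 5))  # (6, 10)
```

Under these assumptions the stippled pixel-shader pass keeps only a quarter of its lanes busy, while the compute mapping launches a quarter as many threads and uses all of them.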

This is a weird problem. On one hand you can do a deferred pass at ¼ resolution, but it involves a lot of memory throughput. On the other hand, you can do all the calculation in the initial render, but it would require four times as many pixels to be processed.

Josh Klint said:
This is a weird problem. On one hand you can do a deferred pass at ¼ resolution, but it involves a lot of memory throughput. On the other hand, you can do all the calculation in the initial render, but it would require four times as many pixels to be processed.

What about downscaling the framebuffer, and doing SSR on that?
Though, I'm not sure how much this still helps as rays diverge, but I guess it's faster, especially if you can use the downscaled frame for other things as well.

So the question would be: native-resolution reflections with flicker issues, or upscaled reflections at half res?

JoeJ said:
What about downscaling the framebuffer, and doing SSR on that?

That involves rendering to four textures to get all the data you need.

It might even be faster to render the entire scene in another pass at a lower resolution.

Josh Klint said:
It might even be faster to render the entire scene in another pass at a lower resolution.

No way. Then you need to fetch all the vertices and textures again, much of that only to be overwritten by closer pixels. You'd be doing all the transformations and triangle setups again, with many threads at triangle edges being idle. And worst: doing all the lighting again.
If the scene has some minimal complexity, downsampling will be much faster than rendering. If it's just a spinning cube covering 10% of the screen, then rendering would be faster, because clearing the screen (write only) takes half the bandwidth of copying it (read plus write).

So I would not reject this. AFAIK, almost all AAA games make a mip pyramid of the framebuffer to be used in multiple effects like SSR, SSAO, bloom, DOF. (Though they can probably reduce this to just color and depth, not the whole G-buffer.)
AMD has a sample on GPUOpen showing that generating multiple levels at once from the high-res image (idling many threads) is faster than generating each level individually from the next higher level (keeping all threads busy but causing more BW to read).
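
A back-of-the-envelope way to see the bandwidth argument, as a Python sketch. This is a deliberately simplified model and not a description of AMD's sample: it only counts texels read from memory, and it assumes the single-pass variant keeps all intermediate levels on-chip. Both function names are made up for illustration.

```python
def chained_reads(width, height):
    """Texels read from memory when each mip level is generated
    from the level directly above it, down to 1x1."""
    reads = 0
    while width > 1 or height > 1:
        reads += width * height  # the whole current level is read back
        width, height = max(1, width // 2), max(1, height // 2)
    return reads

def single_pass_reads(width, height):
    """Texels read from memory when every level is produced in one pass
    over level 0 (intermediate levels assumed to stay on-chip)."""
    return width * height

print(chained_reads(1024, 1024))      # 1398100
print(single_pass_reads(1024, 1024))  # 1048576
```

In this model the chained approach reads about a third more texels, which is the extra read bandwidth mentioned above; real numbers depend on caches, formats, and how the passes are scheduled.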

For SSR and SSAO a mip pyramid is a big optimization, because instead of taking many samples (rough reflections, larger AO distance) you can take just one sample from a prefiltered, lower mip level.
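
A tiny CPU-side illustration of that idea in Python, assuming a square power-of-two image and a plain 2x2 box filter (real engines may use other filters, and depth usually needs min/max rather than an average). The helper names are hypothetical.

```python
def downsample(level):
    """Halve a square power-of-two image with a 2x2 box filter."""
    n = len(level) // 2
    return [[(level[2 * y][2 * x] + level[2 * y][2 * x + 1] +
              level[2 * y + 1][2 * x] + level[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(n)] for y in range(n)]

def build_pyramid(image):
    """Return [level0, level1, ...] down to a 1x1 level."""
    levels = [image]
    while len(levels[-1]) > 1:
        levels.append(downsample(levels[-1]))
    return levels

# 4x4 test image with values 0..15.
image = [[float(y * 4 + x) for x in range(4)] for y in range(4)]
pyramid = build_pyramid(image)

# Averaging 16 taps on level 0 over the whole image...
many_taps = sum(sum(row) for row in image) / 16.0
# ...gives the same result as a single fetch from the 1x1 level.
one_tap = pyramid[2][0][0]
print(many_taps, one_tap)  # 7.5 7.5
```

The wider the footprint you need (rougher reflection, larger AO radius), the lower the level you fetch from, and the cost stays at one sample.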
