
GL3: ANGLE/D3D11 render-to-texture performance fixes #807

Draft
Copilot wants to merge 5 commits into core4 from copilot/fix-render-to-texture-performance

Copilot AI commented Mar 3, 2026

Render-to-texture was still sluggish on Windows/ANGLE/D3D11 after the initial glInvalidateFramebuffer addition. The desktop GL 3.3 path was also gambling on the function pointer being non-null instead of checking that the extension is present.

Extension detection

  • Check GL_ARB_invalidate_subdata extension string via CGlExtensions instead of testing nglInvalidateFramebuffer != NULL (function address can be non-null even when extension is absent)
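
A minimal sketch of the detection rule described above, assuming the extension names have already been parsed into a set. `ExtensionInfoSketch` and `canInvalidateFramebuffer` are hypothetical names for illustration and are not the real CGlExtensions API. A version check for core GL 4.3 contexts is left out for brevity.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical stand-in for the driver's extension flags; not the real CGlExtensions.
struct ExtensionInfoSketch
{
    bool IsGLES3 = false;                  // GLES 3.0 / WebGL 2.0 context
    std::unordered_set<std::string> Names; // names taken from the driver's extension string
};

// Decide whether glInvalidateFramebuffer may be called.
// A non-null function pointer is necessary but not sufficient on desktop GL:
// the extension name must also be advertised.
inline bool canInvalidateFramebuffer(const ExtensionInfoSketch &ext, bool funcPtrNonNull)
{
    if (!funcPtrNonNull)
        return false; // nothing to call
    if (ext.IsGLES3)
        return true;  // core in GLES 3.0
    return ext.Names.count("GL_ARB_invalidate_subdata") != 0;
}
```

With this rule, a desktop context that returns a non-null pointer but does not list the extension is treated as unsupported.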

Default framebuffer invalidation before swap

  • Invalidate depth/stencil on FBO 0 before present — depth/stencil is never needed across frames. Without this, ANGLE stores it during present (STORE_OP_STORE / expensive D3D11 resolve). With invalidation: DiscardView on D3D11, STORE_OP_DONT_CARE on Vulkan.
  • Uses GL_DEPTH/GL_STENCIL (not GL_DEPTH_ATTACHMENT/GL_STENCIL_ATTACHMENT) since FBO 0 is the default framebuffer.
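
The enum choice above can be sketched as a small helper. This is an illustration only: `depthStencilInvalidateTargets` is a hypothetical name, and the constants are local copies of the GL enum values so the snippet compiles without GL headers.

```cpp
#include <array>
#include <cstdint>

// Local copies of GL enum values (assumed to match the GL headers).
constexpr std::uint32_t kGL_DEPTH              = 0x1801; // GL_DEPTH
constexpr std::uint32_t kGL_STENCIL            = 0x1802; // GL_STENCIL
constexpr std::uint32_t kGL_DEPTH_ATTACHMENT   = 0x8D00; // GL_DEPTH_ATTACHMENT
constexpr std::uint32_t kGL_STENCIL_ATTACHMENT = 0x8D20; // GL_STENCIL_ATTACHMENT

// Hypothetical helper: pick the attachment names to pass to
// glInvalidateFramebuffer for the currently bound framebuffer.
inline std::array<std::uint32_t, 2> depthStencilInvalidateTargets(std::uint32_t boundFbo)
{
    if (boundFbo == 0)
        return { kGL_DEPTH, kGL_STENCIL }; // default framebuffer
    return { kGL_DEPTH_ATTACHMENT, kGL_STENCIL_ATTACHMENT }; // user FBO
}

// Intended use before present (not executed here, needs a GL context):
//   auto targets = depthStencilInvalidateTargets(0);
//   glInvalidateFramebuffer(GL_FRAMEBUFFER, 2, targets.data());
```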

Combined depth/stencil clear

  • Added IDriver::clearDepthStencil() (virtual, default falls through to individual clears)
  • GL3 override uses nglClearBufferfi(GL_DEPTH_STENCIL, ...) — single ClearDepthStencilView(DEPTH|STENCIL) on D3D11 instead of two separate calls
  • CDriverUser::clearBuffers() now uses the combined path
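
A rough sketch of the fallback-plus-override shape described above. The class names and method signatures are assumptions for illustration (the real IDriver signatures may differ), and a counter stands in for the backend clear commands.

```cpp
#include <cstdint>

// Hypothetical stand-in for the driver interface; real signatures may differ.
class DriverSketch
{
public:
    virtual ~DriverSketch() = default;

    virtual void clearZBuffer(float depth) { (void)depth; ++BackendClearCalls; }
    virtual void clearStencilBuffer(std::int32_t stencil) { (void)stencil; ++BackendClearCalls; }

    // Default: fall through to the two individual clears.
    virtual void clearDepthStencil(float depth, std::int32_t stencil)
    {
        clearZBuffer(depth);
        clearStencilBuffer(stencil);
    }

    int BackendClearCalls = 0; // counts simulated backend clear commands
};

// Hypothetical GL3-style override that issues one combined clear.
class CombinedClearDriverSketch : public DriverSketch
{
public:
    void clearDepthStencil(float depth, std::int32_t stencil) override
    {
        // Real GL3 code would call:
        //   glClearBufferfi(GL_DEPTH_STENCIL, 0, depth, stencil);
        (void)depth;
        (void)stencil;
        ++BackendClearCalls;
    }
};
```

Drivers that do not override the method keep their existing behavior through the default, while the GL3 path drops from two backend clears to one.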

Remove unnecessary state change in clearZBuffer

  • Removed enableZWrite(true) before nglClearBufferfv(GL_DEPTH, ...). glClearBuffer* is unaffected by write masks per spec; the call was a wasted D3D11 depth-stencil state update

Remove bind-time invalidation (prior commit)

  • Invalidating depth/stencil at FBO bind time was counterproductive: it prevented ANGLE from folding the subsequent clearBuffers() into LOAD_OP_CLEAR, downgrading it to LOAD_OP_DONT_CARE + mid-pass vkCmdClearAttachments
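
The ordering effect described above can be illustrated with a toy model. This is not ANGLE's code. It only encodes the behavior this description attributes to ANGLE: the first operation seen on an attachment decides the load op, and a clear that arrives after that becomes a mid-pass clear. All names are hypothetical.

```cpp
#include <vector>

enum class PassOp { Invalidate, Clear, Draw };
enum class LoadOp { Load, Clear, DontCare };

struct PassPlanSketch
{
    LoadOp DepthLoadOp = LoadOp::Load; // default: previous contents are loaded
    int MidPassClears = 0;             // clears that could not be folded into the load op
};

// Toy model: walk the operations issued after binding an FBO and derive
// the depth load op plus the number of mid-pass clears.
inline PassPlanSketch planDepthPass(const std::vector<PassOp> &ops)
{
    PassPlanSketch plan;
    bool loadOpDecided = false;
    for (PassOp op : ops)
    {
        switch (op)
        {
        case PassOp::Invalidate:
            if (!loadOpDecided) { plan.DepthLoadOp = LoadOp::DontCare; loadOpDecided = true; }
            break;
        case PassOp::Clear:
            if (!loadOpDecided) { plan.DepthLoadOp = LoadOp::Clear; loadOpDecided = true; }
            else ++plan.MidPassClears; // too late to fold into the load op
            break;
        case PassOp::Draw:
            loadOpDecided = true; // load op is fixed once drawing starts
            break;
        }
    }
    return plan;
}
```

In this model, clear-then-draw yields a clear load op with no extra work, while invalidate-then-clear-then-draw yields a don't-care load op plus one mid-pass clear.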

Route FBO deactivation through texture-level method

  • setRenderTarget(NULL) now calls CTextureDrvInfosGL3::activeFrameBufferObject(NULL) instead of the driver-level fallback, so AttachDepthStencil is checked before invalidating
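
A minimal sketch of why the per-FBO path matters, assuming a record that knows whether a depth/stencil buffer is attached. `FboInfoSketch` is a hypothetical stand-in and not the real CTextureDrvInfosGL3. Counters replace the GL calls so the snippet runs without a context.

```cpp
// Hypothetical stand-in for a per-texture FBO record.
struct FboInfoSketch
{
    bool AttachDepthStencil = false; // set when a depth/stencil buffer is attached
    int InvalidateCalls = 0;         // counts simulated glInvalidateFramebuffer calls
    int BindCalls = 0;               // counts simulated binds of framebuffer 0

    // Deactivate this FBO: invalidate only when there is something to
    // discard, then switch back to the default framebuffer.
    void deactivate(bool invalidateSupported)
    {
        if (invalidateSupported && AttachDepthStencil)
            ++InvalidateCalls; // real code: glInvalidateFramebuffer(GL_FRAMEBUFFER, 2, attachments)
        ++BindCalls;           // real code: bind framebuffer 0
    }
};
```

A driver-level fallback has no access to the per-FBO flag, so it would either always invalidate or never do so.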


When deactivating an FBO render target, call glInvalidateFramebuffer
to hint that depth/stencil data is no longer needed. Without this,
ANGLE (Windows/Android WebGL backend) performs an expensive
depth/stencil resolve/copy when switching away from the FBO, causing
severe render-to-texture performance degradation. This is especially
visible in the planar reflection sample.

The function is core in GLES 3.0 (always available on WebGL 2.0) and
optionally loaded on desktop GL 3.3+ (GL 4.3 / ARB_invalidate_subdata).

Co-authored-by: kaetemi <1581053+kaetemi@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix sluggish render-to-texture on Windows/Android" to "Invalidate FBO depth/stencil on unbind to fix ANGLE render-to-texture perf" on Mar 3, 2026
The previous fix placed glInvalidateFramebuffer in
CTextureDrvInfosGL3::activeFrameBufferObject(NULL), but
setRenderTarget(NULL) calls CDriverGL3::activeFrameBufferObject(NULL)
which bypasses it entirely — going straight to forceBindFramebuffer(0).

Fix unbind path: Add invalidation in CDriverGL3::activeFrameBufferObject
  (driver_opengl3.cpp) which is the actual code path called by
  setRenderTarget(NULL). Tells ANGLE not to store depth/stencil data.

Fix bind path: Add invalidation in
  CTextureDrvInfosGL3::activeFrameBufferObject(tex) after binding
  the FBO. Tells ANGLE not to load previous depth/stencil data into
  the new render pass (VK_ATTACHMENT_LOAD_OP_DONT_CARE on Vulkan
  backend, avoids blit on D3D11 backend).

Co-authored-by: kaetemi <1581053+kaetemi@users.noreply.github.com>
Copilot AI changed the title from "Invalidate FBO depth/stencil on unbind to fix ANGLE render-to-texture perf" to "Fix FBO depth/stencil invalidation on ANGLE - was placed in dead code path" on Mar 3, 2026
…dation

1. Add proper GL_ARB_invalidate_subdata extension detection instead
   of gambling on the function address. The extension flag is checked
   via the standard extension string mechanism (setupARBInvalidateSubdata
   + CHECK_EXT_2), not by testing whether nglGetProcAddress returned
   non-null (which can succeed even when the extension is absent).

2. Remove bind-time glInvalidateFramebuffer in activeFrameBufferObject(tex).
   Calling invalidate right after binding prevents ANGLE from merging the
   subsequent clearBuffers() into a LOAD_OP_CLEAR render pass. Instead,
   ANGLE sees LOAD_OP_DONT_CARE (from invalidation) then issues a slower
   mid-pass vkCmdClearAttachments. Without the invalidation, ANGLE sees
   the clear first and uses the optimal LOAD_OP_CLEAR.

3. Keep unbind-time invalidation (STORE_OP_DONT_CARE) which is the
   actually beneficial hint — tells ANGLE not to resolve/store depth/stencil
   when switching away from the FBO.

Co-authored-by: kaetemi <1581053+kaetemi@users.noreply.github.com>
Copilot AI changed the title from "Fix FBO depth/stencil invalidation on ANGLE - was placed in dead code path" to "FBO invalidation: proper extension detection, remove counterproductive bind-time hint" on Mar 3, 2026
Four targeted optimizations for ANGLE/D3D11 FBO render target performance:

1. Invalidate default framebuffer depth/stencil before swapBuffers.
   Depth/stencil data is never needed across frames. Without this hint,
   ANGLE stores it during present (STORE_OP_STORE on Vulkan, expensive
   resolve on D3D11). With invalidation, ANGLE can skip the store
   (DiscardView on D3D11, STORE_OP_DONT_CARE on Vulkan).

2. Add combined clearDepthStencil() using nglClearBufferfi.
   On ANGLE/D3D11 this maps to a single ClearDepthStencilView call with
   D3D11_CLEAR_DEPTH | D3D11_CLEAR_STENCIL, instead of two separate
   ClearDepthStencilView calls (one for depth, one for stencil).
   Used by CDriverUser::clearBuffers() which is the common path.

3. Remove unnecessary enableZWrite(true) from clearZBuffer.
   glClearBuffer* is not affected by the depth write mask per GL spec.
   The state change was wasting a D3D11 depth-stencil state update.

4. Route FBO deactivation through texture-level method.
   setRenderTarget(NULL) now goes through the per-FBO activeFrameBufferObject
   instead of the driver-level fallback, so the AttachDepthStencil flag is
   properly checked before invalidating (avoids unnecessary invalidation
   on FBOs without depth/stencil).

Co-authored-by: kaetemi <1581053+kaetemi@users.noreply.github.com>

Copilot AI changed the title from "FBO invalidation: proper extension detection, remove counterproductive bind-time hint" to "GL3: ANGLE/D3D11 render-to-texture performance fixes" on Mar 3, 2026