GL3: ANGLE/D3D11 render-to-texture performance fixes #807
Draft
When deactivating an FBO render target, call glInvalidateFramebuffer to hint that depth/stencil data is no longer needed. Without this, ANGLE (Windows/Android WebGL backend) performs an expensive depth/stencil resolve/copy when switching away from the FBO, causing severe render-to-texture performance degradation. This is especially visible in the planar reflection sample.

The function is core in GLES 3.0 (always available on WebGL 2.0) and optionally loaded on desktop GL 3.3+ (GL 4.3 / ARB_invalidate_subdata).

Co-authored-by: kaetemi <1581053+kaetemi@users.noreply.github.com>
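As a rough illustration of the unbind-time hint described above, the following sketch guards the call behind an optionally loaded function pointer. The helper name `invalidateDepthStencilOnUnbind`, the `InvalidateFn` alias, and the `hasDepthStencil` flag are hypothetical and not taken from the driver sources; the enum values are copied from the Khronos headers so that the snippet compiles without a GL context.

```cpp
#include <cstdint>

// Enum values as defined in the Khronos GL headers, redeclared here so the
// sketch builds without including GL.
constexpr std::uint32_t kGLFramebuffer       = 0x8D40; // GL_FRAMEBUFFER
constexpr std::uint32_t kGLDepthAttachment   = 0x8D00; // GL_DEPTH_ATTACHMENT
constexpr std::uint32_t kGLStencilAttachment = 0x8D20; // GL_STENCIL_ATTACHMENT

// Same parameter shape as glInvalidateFramebuffer(target, numAttachments, attachments).
using InvalidateFn = void (*)(std::uint32_t target,
                              std::int32_t numAttachments,
                              const std::uint32_t* attachments);

// Hypothetical helper: call it while the FBO is still bound, just before
// switching away. Returns true if the hint was issued.
inline bool invalidateDepthStencilOnUnbind(InvalidateFn invalidate, bool hasDepthStencil)
{
    // The entry point is optional on desktop GL, and the hint is pointless
    // for an FBO without a depth/stencil attachment.
    if (invalidate == nullptr || !hasDepthStencil)
        return false;

    const std::uint32_t attachments[] = { kGLDepthAttachment, kGLStencilAttachment };
    invalidate(kGLFramebuffer, 2, attachments);
    return true;
}
```

In a real driver the function pointer would be the loaded `glInvalidateFramebuffer` entry point, and the call would sit immediately before the framebuffer binding changes.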
Copilot AI changed the title from "[WIP] Fix sluggish render-to-texture on Windows/Android" to "Invalidate FBO depth/stencil on unbind to fix ANGLE render-to-texture perf" on Mar 3, 2026
The previous fix placed glInvalidateFramebuffer in CTextureDrvInfosGL3::activeFrameBufferObject(NULL), but setRenderTarget(NULL) calls CDriverGL3::activeFrameBufferObject(NULL), which bypasses it entirely and goes straight to forceBindFramebuffer(0).

Fix unbind path: Add invalidation in CDriverGL3::activeFrameBufferObject (driver_opengl3.cpp), which is the actual code path called by setRenderTarget(NULL). This tells ANGLE not to store depth/stencil data.

Fix bind path: Add invalidation in CTextureDrvInfosGL3::activeFrameBufferObject(tex) after binding the FBO. This tells ANGLE not to load previous depth/stencil data into the new render pass (VK_ATTACHMENT_LOAD_OP_DONT_CARE on the Vulkan backend, avoids a blit on the D3D11 backend).

Co-authored-by: kaetemi <1581053+kaetemi@users.noreply.github.com>
Copilot AI changed the title from "Invalidate FBO depth/stencil on unbind to fix ANGLE render-to-texture perf" to "Fix FBO depth/stencil invalidation on ANGLE - was placed in dead code path" on Mar 3, 2026
…dation

1. Add proper GL_ARB_invalidate_subdata extension detection instead of gambling on the function address. The extension flag is checked via the standard extension string mechanism (setupARBInvalidateSubdata + CHECK_EXT_2), not by testing whether nglGetProcAddress returned non-null (which can succeed even when the extension is absent).

2. Remove bind-time glInvalidateFramebuffer in activeFrameBufferObject(tex). Calling invalidate right after binding prevents ANGLE from merging the subsequent clearBuffers() into a LOAD_OP_CLEAR render pass. Instead, ANGLE sees LOAD_OP_DONT_CARE (from the invalidation) and then issues a slower mid-pass vkCmdClearAttachments. Without the invalidation, ANGLE sees the clear first and uses the optimal LOAD_OP_CLEAR.

3. Keep unbind-time invalidation (STORE_OP_DONT_CARE), which is the hint that actually helps: it tells ANGLE not to resolve/store depth/stencil when switching away from the FBO.

Co-authored-by: kaetemi <1581053+kaetemi@users.noreply.github.com>
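The difference between trusting a non-null function address and checking the extension string can be sketched as follows. This is only an illustration: `hasExtensionToken` and `canUseInvalidate` are hypothetical names, not the driver's setupARBInvalidateSubdata or CHECK_EXT_2, and the sketch assumes the extension names are available as one space-separated string (a GL 3 core context enumerates them one by one instead).

```cpp
#include <sstream>
#include <string>

// Whole-token match against a space-separated extension list. A plain
// substring search would also match longer names that merely start with
// the requested one.
inline bool hasExtensionToken(const std::string& extensionList, const std::string& name)
{
    std::istringstream tokens(extensionList);
    std::string token;
    while (tokens >> token)
    {
        if (token == name)
            return true;
    }
    return false;
}

// Hypothetical gate: the entry point is used only when the extension is
// advertised AND the loader returned an address. A non-null address alone
// is not treated as proof that the extension is present.
inline bool canUseInvalidate(const std::string& extensionList, const void* procAddress)
{
    return procAddress != nullptr
        && hasExtensionToken(extensionList, "GL_ARB_invalidate_subdata");
}
```

With this gate, a loader that returns a stub address for an unsupported entry point no longer enables the invalidation path by accident.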
Copilot AI changed the title from "Fix FBO depth/stencil invalidation on ANGLE - was placed in dead code path" to "FBO invalidation: proper extension detection, remove counterproductive bind-time hint" on Mar 3, 2026
Four targeted optimizations for ANGLE/D3D11 FBO render target performance:

1. Invalidate default framebuffer depth/stencil before swapBuffers. Depth/stencil data is never needed across frames. Without this hint, ANGLE stores it during present (STORE_OP_STORE on Vulkan, expensive resolve on D3D11). With invalidation, ANGLE can skip the store (DiscardView on D3D11, STORE_OP_DONT_CARE on Vulkan).

2. Add combined clearDepthStencil() using nglClearBufferfi. On ANGLE/D3D11 this maps to a single ClearDepthStencilView call with D3D11_CLEAR_DEPTH | D3D11_CLEAR_STENCIL, instead of two separate ClearDepthStencilView calls (one for depth, one for stencil). Used by CDriverUser::clearBuffers(), which is the common path.

3. Remove unnecessary enableZWrite(true) from clearZBuffer. glClearBuffer* is not affected by the depth write mask per GL spec. The state change was wasting a D3D11 depth-stencil state update.

4. Route FBO deactivation through texture-level method. setRenderTarget(NULL) now goes through the per-FBO activeFrameBufferObject instead of the driver-level fallback, so the AttachDepthStencil flag is properly checked before invalidating (avoids unnecessary invalidation on FBOs without depth/stencil).

Co-authored-by: kaetemi <1581053+kaetemi@users.noreply.github.com>
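The invalidation enums differ between the default framebuffer and an application-created FBO: framebuffer 0 takes GL_DEPTH / GL_STENCIL, while an FBO takes the attachment points. A minimal sketch of that selection is shown here; `depthStencilInvalidateEnums` is a hypothetical helper name, and the constants mirror the Khronos header values so the snippet stands alone.

```cpp
#include <array>
#include <cstdint>

// Enum values as defined in the Khronos GL headers.
constexpr std::uint32_t kGLDepth             = 0x1801; // GL_DEPTH
constexpr std::uint32_t kGLStencil           = 0x1802; // GL_STENCIL
constexpr std::uint32_t kGLDepthAttachment   = 0x8D00; // GL_DEPTH_ATTACHMENT
constexpr std::uint32_t kGLStencilAttachment = 0x8D20; // GL_STENCIL_ATTACHMENT

// Hypothetical helper: choose the enums to pass to glInvalidateFramebuffer
// for the currently bound framebuffer object name.
inline std::array<std::uint32_t, 2> depthStencilInvalidateEnums(std::uint32_t boundFramebuffer)
{
    if (boundFramebuffer == 0)
    {
        // Default framebuffer: buffers are named directly.
        return {{ kGLDepth, kGLStencil }};
    }
    // Application-created FBO: buffers are named by attachment point.
    return {{ kGLDepthAttachment, kGLStencilAttachment }};
}
```

The pre-swap path would use the framebuffer 0 result, and the FBO deactivation path would use the attachment-point result.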
Copilot AI changed the title from "FBO invalidation: proper extension detection, remove counterproductive bind-time hint" to "GL3: ANGLE/D3D11 render-to-texture performance fixes" on Mar 3, 2026
Render-to-texture was still sluggish on Windows/ANGLE/D3D11 after the initial glInvalidateFramebuffer addition. The desktop GL 3.3 path was also gambling on the function pointer address instead of checking extension presence.

Extension detection
- Check the GL_ARB_invalidate_subdata extension string via CGlExtensions instead of testing nglInvalidateFramebuffer != NULL (the function address can be non-null even when the extension is absent).

Default framebuffer invalidation before swap
- Invalidate default framebuffer depth/stencil before swapBuffers. Without the hint, ANGLE stores the data during present (STORE_OP_STORE / expensive D3D11 resolve). With invalidation: DiscardView on D3D11, STORE_OP_DONT_CARE on Vulkan.
- Uses GL_DEPTH / GL_STENCIL (not GL_DEPTH_ATTACHMENT / GL_STENCIL_ATTACHMENT) since FBO 0 is the default framebuffer.

Combined depth/stencil clear
- Add IDriver::clearDepthStencil() (virtual, default falls through to individual clears).
- The GL3 driver uses nglClearBufferfi(GL_DEPTH_STENCIL, ...): a single ClearDepthStencilView(DEPTH|STENCIL) on D3D11 instead of two separate calls.
- CDriverUser::clearBuffers() now uses the combined path.

Remove unnecessary state change in clearZBuffer
- Remove enableZWrite(true) before nglClearBufferfv(GL_DEPTH, ...). glClearBuffer* is unaffected by write masks per spec; the call was a wasted D3D11 depth-stencil state update.

Remove bind-time invalidation (prior commit)
- Invalidating right after bind prevented ANGLE from merging the subsequent clearBuffers() into LOAD_OP_CLEAR, downgrading it to LOAD_OP_DONT_CARE + mid-pass vkCmdClearAttachments.

Route FBO deactivation through texture-level method
- setRenderTarget(NULL) now calls CTextureDrvInfosGL3::activeFrameBufferObject(NULL) instead of the driver-level fallback, so AttachDepthStencil is checked before invalidating.
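The "virtual, default falls through to individual clears" arrangement for the combined depth/stencil clear might look roughly like the sketch below. The types `SketchDriver`, `SeparateClearDriver`, and `CombinedClearDriver` and their method signatures are hypothetical stand-ins, not the real IDriver or CDriverGL3 declarations; calls are recorded as strings so that the control flow is visible without a GL context.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the driver interface.
struct SketchDriver
{
    virtual ~SketchDriver() = default;
    virtual void clearZBuffer(float depth) = 0;
    virtual void clearStencilBuffer(int stencil) = 0;

    // Default implementation: fall through to the two individual clears,
    // so backends without a combined clear keep working unchanged.
    virtual void clearDepthStencil(float depth, int stencil)
    {
        clearZBuffer(depth);
        clearStencilBuffer(stencil);
    }
};

// Backend that only knows separate clears and inherits the default.
struct SeparateClearDriver : SketchDriver
{
    std::vector<std::string> calls;
    void clearZBuffer(float) override { calls.push_back("depth"); }
    void clearStencilBuffer(int) override { calls.push_back("stencil"); }
};

// Backend with a combined clear. A GL 3 driver would issue
// glClearBufferfi(GL_DEPTH_STENCIL, 0, depth, stencil) at this point.
struct CombinedClearDriver : SeparateClearDriver
{
    void clearDepthStencil(float, int) override { calls.push_back("depth+stencil"); }
};
```

Callers such as a clearBuffers() wrapper can then call clearDepthStencil() unconditionally and get one clear where the backend supports it, two otherwise.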