Asynchronous Graph Culling and Frame-Independent Task Scheduling #2887
Draft
douira wants to merge 3 commits into CaffeineMC:dev
This was referenced Nov 22, 2024
94885b5 to 158cd78
1ad3a3f to f969cbb
ecdd040 to df32670
29caa4b to 9c44397
3f21cde to 99bd356
0c8a3d9 to d4269aa
7721195 to 927caf6
927caf6 to ea659d1
ea659d1 to f94b366
a3b65f1 to db115a2
douira added a commit to douira/sodium that referenced this pull request on May 29, 2025
douira added a commit to douira/sodium that referenced this pull request on May 30, 2025
douira added a commit to douira/sodium that referenced this pull request on May 31, 2025
…ask scheduling (CaffeineMC#2887 merge item 1)
douira added a commit to douira/sodium that referenced this pull request on May 31, 2025
…n the camera is outside the graph (CaffeineMC#2887 merge item 2)
Will this PR be fixed for extended render distance mods such as C2ME? The rendering engine just explodes and crashes when using it.
Contributor, Author
Please come to our Discord server and give a more detailed report of the issue you're experiencing in the appropriate testing thread.
Discord is blocked in my country, may I report it here?
douira added a commit to douira/sodium that referenced this pull request on Jul 24, 2025
…ask scheduling (CaffeineMC#2887 merge item 1)
douira added a commit to douira/sodium that referenced this pull request on Aug 26, 2025
…n the camera is outside the graph (CaffeineMC#2887 merge item 2)
douira added a commit to douira/sodium that referenced this pull request on Aug 28, 2025
…eration (CaffeineMC#2887) This commit includes the remaining merge items needed for full async culling and render list generation. It also takes advantage of the separately introduced upload time limit for task scheduling.
db115a2 to bab8281
Contributor, Author
I've rebased this on top of the current
douira added a commit to douira/sodium that referenced this pull request on Sep 12, 2025
…eration (CaffeineMC#2887) This commit includes the remaining merge items needed for full async culling and render list generation. It also takes advantage of the separately introduced upload time limit for task scheduling.
bab8281 to 53b0418
MoePus added a commit to MoePus/sodium that referenced this pull request on Nov 10, 2025
…ask scheduling (CaffeineMC#2887 merge item 1)
# Conflicts:
# common/src/main/java/net/caffeinemc/mods/sodium/client/gui/SodiumGameOptionPages.java
# common/src/main/java/net/caffeinemc/mods/sodium/client/gui/SodiumGameOptions.java
# common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/ChunkUpdateType.java
# common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/RenderSectionManager.java
# common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/compile/tasks/ChunkBuilderMeshingTask.java
# common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/compile/tasks/ChunkBuilderSortingTask.java
# common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/lists/VisibleChunkCollector.java
# common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/translucent_sorting/data/DynamicBSPData.java
# common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/translucent_sorting/data/DynamicTopoData.java
# common/src/main/resources/assets/sodium/lang/en_us.json
MoePus pushed a commit to MoePus/sodium that referenced this pull request on Nov 10, 2025
…n the camera is outside the graph (CaffeineMC#2887 merge item 2)
MoePus added a commit to MoePus/sodium that referenced this pull request on Nov 11, 2025
…ask scheduling (CaffeineMC#2887 merge item 1)
# Conflicts:
# common/src/main/java/net/caffeinemc/mods/sodium/client/gui/SodiumGameOptionPages.java
# common/src/main/java/net/caffeinemc/mods/sodium/client/gui/SodiumGameOptions.java
# common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/ChunkUpdateType.java
# common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/RenderSectionManager.java
# common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/compile/tasks/ChunkBuilderMeshingTask.java
# common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/compile/tasks/ChunkBuilderSortingTask.java
# common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/lists/VisibleChunkCollector.java
# common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/translucent_sorting/data/DynamicBSPData.java
# common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/translucent_sorting/data/DynamicTopoData.java
# common/src/main/resources/assets/sodium/lang/en_us.json
MoePus pushed a commit to MoePus/sodium that referenced this pull request on Nov 11, 2025
…n the camera is outside the graph (CaffeineMC#2887 merge item 2)
douira added a commit to douira/sodium that referenced this pull request on Jan 18, 2026
…eration (CaffeineMC#2887) This commit includes the remaining merge items needed for full async culling and render list generation. It also takes advantage of the separately introduced upload time limit for task scheduling. Currently, this regresses dev by not implementing immediate chunk presentation and flawless frames may also be broken as a result.
53b0418 to 93fd31e
93fd31e to 446d00b
aaa2696 to 1b77fa7
douira added a commit to douira/sodium that referenced this pull request on Feb 28, 2026
make ray culling safer but also less effective
fix initialization error when inside a dead end section
fix incremental diamond initialization in combination with angular culling
fix out-of-world rendering to properly disable new angle technique
microoptimization: ensuring capacity on every visit costs ~25ns
track 10-bit quantized min/max angles over 3 planes packed in a long ~60% more cycles during BFS (~0.5ms, 225->355ns per section) in exchange for 5-25% more chunks culled.
attempt to track occlusions using a bitset for angles and a LUT it's terrible, only culling 3%, but it's fast, only taking 60ns
squeeze a bit more micro-optimization out of intersectSlopes
trim some branches
rewrite slope code to use all integer ops and no divides
add slope-refinement based occlusion culling to cull ~25% more chunks
Co-authored-by: Ryan Hitchman <hitchmanr@gmail.com>
reenable ray occlusion. known issue: in rare cases some sections are culled when they shouldn't be
fix comment
Fix directional visibility calculation when bfs is non-frustum culled or wide
extract angle visibility mask calculation into occlusion visitor to account for expanded non-occlusion zone when reusing the result for the whole section or in wide-bfs mode
Implementation of asynchronous culling and tree-based render list generation (CaffeineMC#2887) This commit includes the remaining merge items needed for full async culling and render list generation. It also takes advantage of the separately introduced upload time limit for task scheduling. Currently, this regresses dev by not implementing immediate chunk presentation and flawless frames may also be broken as a result.
remove GraphDirection.OPPOSITE since we can also just do a bitwise XOR on the direction index
propagate symmetry into the number of visibility data sets needed, reduce stack size to 3*16 since paths can no longer fold back in on themselves
use symmetry to avoid doing half the work in generating visibility data
use bitfield index based neighbor traversal instead of conversion to coordinate triplets
in vis graph construction, skip origin directions that are opposite the allowed step directions since they cannot lead to any visibility
Remove debug code, undo unnecessary changes, delete unused code
Avoid continuing the search if all possible destination faces have already been reached
Don't generate full visibility data arrays if they're just the same everywhere
Fix crash when initializing air chunks and they have no visibility data
Perspective based occlusion culling
Improve accuracy of visibility data by not allowing the traversal to go backwards
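One of the commit messages above mentions dropping GraphDirection.OPPOSITE in favor of a bitwise XOR on the direction index. A minimal sketch of why that can work, assuming the six directions are indexed so that the two directions of each axis sit next to each other (0/1, 2/3, 4/5). The class name, constant names, and index layout here are assumptions for illustration and are not taken from the PR's code:

```java
// Hypothetical sketch; the index layout is an assumption, not confirmed by the PR.
final class DirectionIndex {
    // Assumed layout: each axis occupies two adjacent indices, so the two
    // directions of an axis differ only in the lowest bit.
    static final int DOWN = 0, UP = 1, NORTH = 2, SOUTH = 3, WEST = 4, EAST = 5;
    static final int COUNT = 6;

    /** Returns the opposite direction by flipping the lowest bit of the index. */
    static int opposite(int direction) {
        return direction ^ 1;
    }

    public static void main(String[] args) {
        // Prints each direction index next to the index of its opposite.
        for (int d = 0; d < COUNT; d++) {
            System.out.println(d + " <-> " + opposite(d));
        }
    }
}
```

With this layout no lookup table is needed, and applying the operation twice returns the original index.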
c0802ad to c79fa5a
c79fa5a to fc42a69
douira added a commit to douira/sodium that referenced this pull request on Mar 24, 2026
douira added a commit to douira/sodium that referenced this pull request on Mar 25, 2026
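The commit messages above describe tracking 10-bit quantized min/max angles over 3 planes packed in a long. The arithmetic fits: 3 planes times 2 values times 10 bits is 60 bits, within a 64-bit long. A rough sketch of such a packing follows. The bit layout, the class and method names, and the intersect operation are all assumptions made for illustration, not the PR's actual implementation:

```java
// Hypothetical sketch; the bit layout and method names are assumptions for illustration.
final class PackedAngleRanges {
    static final int BITS = 10;                 // bits per quantized angle
    static final long MASK = (1L << BITS) - 1;  // 0x3FF
    static final int PLANES = 3;                // 3 planes * 2 values * 10 bits = 60 bits

    /** Bit offset of the (min, max) pair belonging to the given plane. */
    private static int shift(int plane) {
        return plane * 2 * BITS;
    }

    static int min(long packed, int plane) {
        return (int) ((packed >>> shift(plane)) & MASK);
    }

    static int max(long packed, int plane) {
        return (int) ((packed >>> (shift(plane) + BITS)) & MASK);
    }

    /** Returns a copy of packed with the given plane's range replaced by [lo, hi]. */
    static long with(long packed, int plane, int lo, int hi) {
        long pair = (lo & MASK) | ((hi & MASK) << BITS);
        long cleared = packed & ~(((1L << (2 * BITS)) - 1) << shift(plane));
        return cleared | (pair << shift(plane));
    }

    /** Narrows each plane's range to the overlap of a and b (assumed operation). */
    static long intersect(long a, long b) {
        long result = 0;
        for (int p = 0; p < PLANES; p++) {
            int lo = Math.max(min(a, p), min(b, p));
            int hi = Math.min(max(a, p), max(b, p));
            result = with(result, p, lo, hi);
        }
        return result;
    }

    /** True when min exceeds max in at least one plane, meaning no angle remains. */
    static boolean isEmpty(long packed) {
        for (int p = 0; p < PLANES; p++) {
            if (min(packed, p) > max(packed, p)) {
                return true;
            }
        }
        return false;
    }
}
```

Keeping all six values in one primitive avoids per-section allocations during the search, which matters when the per-section cost is measured in nanoseconds as in the commit message.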
Uses an octree to generate render lists independently of the slow graph search, which now runs asynchronously.
Testing has not shown regressions. Frame rate generally improves a little on systems that were not limited by render list generation, and a lot on systems that were (see testing thread).
Companion PR in Iris: IrisShaders/Iris#2539 (outdated)
Prerequisite: #3484
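The description above separates render list generation from the graph search. A minimal sketch of that decoupling, under stated assumptions: a background task records the sections found visible in an octree and publishes it atomically, while the frame thread builds its render list from the most recently published tree without waiting for the search. AsyncCullSketch, Octree, Box, and the 16-section cube are hypothetical names and sizes chosen for illustration. They are not Sodium classes, and the real search and frustum test are replaced by plain inputs:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Predicate;

// Hypothetical sketch; none of these names come from the PR.
final class AsyncCullSketch {
    /** Axis-aligned cube of sections; size is a power of two. */
    static final class Box {
        final int x, y, z, size;
        Box(int x, int y, int z, int size) {
            this.x = x; this.y = y; this.z = z; this.size = size;
        }
    }

    /** Minimal octree marking which sections the search found visible. */
    static final class Octree {
        private final Box bounds;
        private Octree[] children;   // allocated lazily when a section below is marked
        private boolean leafVisible; // only meaningful for size-1 nodes

        Octree(Box bounds) { this.bounds = bounds; }

        /** Marks a section visible; coordinates must lie inside this node's bounds. */
        void markVisible(int x, int y, int z) {
            if (bounds.size == 1) { leafVisible = true; return; }
            if (children == null) children = new Octree[8];
            int half = bounds.size / 2;
            int ix = (x - bounds.x) / half, iy = (y - bounds.y) / half, iz = (z - bounds.z) / half;
            int index = ix | (iy << 1) | (iz << 2);
            if (children[index] == null) {
                children[index] = new Octree(new Box(
                        bounds.x + ix * half, bounds.y + iy * half, bounds.z + iz * half, half));
            }
            children[index].markVisible(x, y, z);
        }

        /** Collects visible sections, skipping whole subtrees that fail the frustum test. */
        void collect(Predicate<Box> inFrustum, List<Box> out) {
            if (!inFrustum.test(bounds)) return;
            if (bounds.size == 1) { if (leafVisible) out.add(bounds); return; }
            if (children == null) return;
            for (Octree child : children) {
                if (child != null) child.collect(inFrustum, out);
            }
        }
    }

    private static final int WORLD_SIZE = 16; // assumed cube edge length in sections
    private final AtomicReference<Octree> latest =
            new AtomicReference<>(new Octree(new Box(0, 0, 0, WORLD_SIZE)));

    /** Stand-in for the slow graph search: builds a new tree off-thread, then publishes it. */
    CompletableFuture<Void> startSearch(int[][] visibleSections) {
        return CompletableFuture.runAsync(() -> {
            Octree tree = new Octree(new Box(0, 0, 0, WORLD_SIZE));
            for (int[] s : visibleSections) tree.markVisible(s[0], s[1], s[2]);
            latest.set(tree); // published only once fully built
        });
    }

    /** Called every frame; reads the latest published tree and never blocks on the search. */
    List<Box> buildRenderList(Predicate<Box> inFrustum) {
        List<Box> out = new ArrayList<>();
        latest.get().collect(inFrustum, out);
        return out;
    }
}
```

In this sketch the frame thread can re-run the cheap tree traversal with a fresh frustum every frame, while the expensive search result is only replaced when a new one finishes.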