
Asynchronous Graph Culling and Frame-Independent Task Scheduling #2887

Draft
douira wants to merge 3 commits into CaffeineMC:dev from douira:decoupled-frustum-test

Conversation

@douira
Contributor

@douira douira commented Nov 22, 2024

Uses an octree to generate render lists independently of the slow graph search, which now runs asynchronously.

  • The occlusion culler runs in a separate thread, which allows a large speedup in render list generation time. Correct culling results are ensured with a combination of different types of BFS. Synchronous occlusion culling is used as a last resort if the camera teleports or moves extremely quickly.
  • Tasks are ordered by a combined score of how long they've been pending, their distance from the camera, their type, and whether they're visible in the camera frustum. The number of tasks that can be scheduled is now independent of the frame rate and is instead limited by the tasks' estimated duration and size.
  • An upload limit keeps the scheduler from submitting more tasks than the upload buffer can hold. This isn't a hard limit, and this PR doesn't implement a new way of handling task buffers, to avoid expanding the scope too far.
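
The scoring and budget rules in the list above can be pictured with a minimal sketch. This is not the PR's code: the class `TaskBudgetSketch`, the record `PendingTask`, its fields, the weights in `score`, and the rule that at least one task is always admitted (one way to make the limit soft) are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; none of these names come from the Sodium code base.
final class TaskBudgetSketch {
    // One pending task with the factors named in the description.
    record PendingTask(String name, long pendingNanos, double distance,
                       boolean inFrustum, boolean important,
                       long estimatedNanos, long estimatedBytes) {}

    // Lower score means higher priority. The weights are arbitrary assumptions.
    static double score(PendingTask t) {
        double s = t.distance();             // closer tasks come first
        s -= t.pendingNanos() / 1_000_000.0; // one point per millisecond of waiting
        if (t.inFrustum()) s -= 50.0;        // visible tasks get a bonus
        if (t.important()) s -= 100.0;       // stand-in for the task "type" factor
        return s;
    }

    // Picks tasks in priority order until the duration or upload budget is used up.
    // The limit is soft: the first task is always admitted so that progress is made.
    static List<PendingTask> select(List<PendingTask> pending,
                                    long durationBudgetNanos, long uploadBudgetBytes) {
        List<PendingTask> sorted = new ArrayList<>(pending);
        sorted.sort(Comparator.comparingDouble(TaskBudgetSketch::score));

        List<PendingTask> chosen = new ArrayList<>();
        long usedNanos = 0;
        long usedBytes = 0;
        for (PendingTask t : sorted) {
            boolean overBudget = usedNanos + t.estimatedNanos() > durationBudgetNanos
                    || usedBytes + t.estimatedBytes() > uploadBudgetBytes;
            if (!chosen.isEmpty() && overBudget) {
                break;
            }
            chosen.add(t);
            usedNanos += t.estimatedNanos();
            usedBytes += t.estimatedBytes();
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<PendingTask> pending = List.of(
                new PendingTask("near-visible", 0L, 10.0, true, false, 2_000_000L, 1_000L),
                new PendingTask("near-hidden", 0L, 5.0, false, false, 2_000_000L, 1_000L),
                new PendingTask("far-old", 200_000_000L, 100.0, false, false, 2_000_000L, 1_000L));
        for (PendingTask t : select(pending, 5_000_000L, 1_000_000L)) {
            System.out.println(t.name());
        }
    }
}
```

Because the budget is expressed in estimated nanoseconds and bytes rather than "tasks per frame", the amount of work admitted per scheduling pass does not change with the frame rate, which is the property the description highlights.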

Testing has not shown regressions. Frame rate generally improves a little on systems that were not limited by render list generation, and a lot on systems that were (see testing thread).

Companion PR in Iris: IrisShaders/Iris#2539 (outdated)
Prerequisite: #3484

@douira douira force-pushed the decoupled-frustum-test branch from 94885b5 to 158cd78 Compare November 25, 2024 19:12
@douira douira added the T-enhancement Type: Enhancement label Nov 26, 2024
@douira douira force-pushed the decoupled-frustum-test branch from 1ad3a3f to f969cbb Compare November 27, 2024 22:35
@jellysquid3 jellysquid3 added this to the Sodium 0.7 milestone Dec 2, 2024
@douira douira force-pushed the decoupled-frustum-test branch from ecdd040 to df32670 Compare December 7, 2024 22:31
@douira douira force-pushed the decoupled-frustum-test branch from 29caa4b to 9c44397 Compare December 31, 2024 04:41
@douira douira force-pushed the decoupled-frustum-test branch 2 times, most recently from 3f21cde to 99bd356 Compare January 19, 2025 16:06
@douira douira force-pushed the decoupled-frustum-test branch 4 times, most recently from 0c8a3d9 to d4269aa Compare February 18, 2025 19:25
@douira douira changed the title from "Asynchronous Render List Generation and Frame-Independent Task Scheduling" to "Asynchronous Graph Culling and Frame-Independent Task Scheduling" Feb 20, 2025
@douira douira force-pushed the decoupled-frustum-test branch 2 times, most recently from 7721195 to 927caf6 Compare February 22, 2025 00:25
@douira douira force-pushed the decoupled-frustum-test branch from 927caf6 to ea659d1 Compare April 6, 2025 12:59
@douira douira force-pushed the decoupled-frustum-test branch from ea659d1 to f94b366 Compare April 23, 2025 22:35
@douira douira force-pushed the decoupled-frustum-test branch 2 times, most recently from a3b65f1 to db115a2 Compare May 29, 2025 22:33
douira added a commit to douira/sodium that referenced this pull request May 29, 2025
douira added a commit to douira/sodium that referenced this pull request May 30, 2025
douira added a commit to douira/sodium that referenced this pull request May 31, 2025
douira added a commit to douira/sodium that referenced this pull request May 31, 2025
@ghost

ghost commented Jun 1, 2025

Will this PR be fixed for extended render distance mods such as C2ME? The rendering engine just explodes and crashes when using it.

@douira
Contributor Author

douira commented Jun 1, 2025

Please come to our discord server and give a more detailed report of the issue you're experiencing in the appropriate testing thread.

@ghost

ghost commented Jun 2, 2025

> Please come to our discord server and give a more detailed report of the issue you're experiencing in the appropriate testing thread.

Discord is blocked in my country, may I report it here?

douira added a commit to douira/sodium that referenced this pull request Jul 24, 2025
douira added a commit to douira/sodium that referenced this pull request Aug 26, 2025
jellysquid3 pushed a commit that referenced this pull request Aug 28, 2025
jellysquid3 pushed a commit that referenced this pull request Aug 28, 2025
douira added a commit to douira/sodium that referenced this pull request Aug 28, 2025
…eration (CaffeineMC#2887)

This commit includes the remaining merge items needed for full async culling and render list generation. It also takes advantage of the separately introduced upload time limit for task scheduling.
douira added a commit to douira/sodium that referenced this pull request Aug 28, 2025
@douira douira force-pushed the decoupled-frustum-test branch from db115a2 to bab8281 Compare August 28, 2025 21:36
@douira
Contributor Author

douira commented Aug 28, 2025

I've rebased this on top of the current dev and it seems to work ok for me so far. It'll require more testing and tuning since the merge conflict was rather involved.

douira added a commit to douira/sodium that referenced this pull request Sep 12, 2025
@douira douira force-pushed the decoupled-frustum-test branch from bab8281 to 53b0418 Compare September 12, 2025 14:03
MoePus added a commit to MoePus/sodium that referenced this pull request Nov 10, 2025
…ask scheduling (CaffeineMC#2887 merge item 1)

# Conflicts:
#	common/src/main/java/net/caffeinemc/mods/sodium/client/gui/SodiumGameOptionPages.java
#	common/src/main/java/net/caffeinemc/mods/sodium/client/gui/SodiumGameOptions.java
#	common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/ChunkUpdateType.java
#	common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/RenderSectionManager.java
#	common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/compile/tasks/ChunkBuilderMeshingTask.java
#	common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/compile/tasks/ChunkBuilderSortingTask.java
#	common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/lists/VisibleChunkCollector.java
#	common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/translucent_sorting/data/DynamicBSPData.java
#	common/src/main/java/net/caffeinemc/mods/sodium/client/render/chunk/translucent_sorting/data/DynamicTopoData.java
#	common/src/main/resources/assets/sodium/lang/en_us.json
MoePus pushed a commit to MoePus/sodium that referenced this pull request Nov 10, 2025
MoePus added a commit to MoePus/sodium that referenced this pull request Nov 11, 2025
MoePus pushed a commit to MoePus/sodium that referenced this pull request Nov 11, 2025
@douira douira modified the milestones: Sodium 0.7, Upcoming Major Dec 9, 2025
douira added a commit to douira/sodium that referenced this pull request Jan 18, 2026
…eration (CaffeineMC#2887)

This commit includes the remaining merge items needed for full async culling and render list generation. It also takes advantage of the separately introduced upload time limit for task scheduling.

Currently, this regresses dev by not implementing immediate chunk presentation, and flawless frames may also be broken as a result.
@douira douira force-pushed the decoupled-frustum-test branch from 53b0418 to 93fd31e Compare January 18, 2026 16:36
douira added a commit to douira/sodium that referenced this pull request Jan 18, 2026
@douira douira force-pushed the decoupled-frustum-test branch from 93fd31e to 446d00b Compare January 18, 2026 20:09
@douira douira marked this pull request as draft January 18, 2026 20:09
douira added a commit to douira/sodium that referenced this pull request Jan 18, 2026
@douira douira force-pushed the decoupled-frustum-test branch from aaa2696 to 1b77fa7 Compare January 18, 2026 22:22
douira added a commit to douira/sodium that referenced this pull request Feb 28, 2026
make ray culling safer but also less effective

fix initialization error when inside a dead end section

fix incremental diamond initialization in combination with angular culling

fix out-of-world rendering to properly disable new angle technique

microoptimization: ensuring capacity on every visit costs ~25ns

track 10-bit quantized min/max angles over 3 planes packed in a long

~60% more cycles during BFS (~0.5ms, 225->355ns per section) in exchange for
5-25% more chunks culled.

attempt to track occlusions using a bitset for angles and a LUT

it's terrible, only culling 3%, but it's fast, only taking 60ns

squeeze a bit more micro-optimization out of intersectSlopes

trim some branches

rewrite slope code to use all integer ops and no divides

add slope-refinement based occlusion culling to cull ~25% more chunks

Co-authored-by: Ryan Hitchman <hitchmanr@gmail.com>

reenable ray occlusion. known issue: in rare cases some sections are culled when they shouldn't be

fix comment

Fix directional visibility calculation when bfs is non-frustum culled or wide

extract angle visibility mask calculation into occlusion visitor to account for expanded non-occlusion zone when reusing the result for the whole section or in wide-bfs mode

Implementation of asynchronous culling and tree-based render list generation (CaffeineMC#2887)

This commit includes the remaining merge items needed for full async culling and render list generation. It also takes advantage of the separately introduced upload time limit for task scheduling.

Currently, this regresses dev by not implementing immediate chunk presentation and flawless frames may also be broken as a result.

remove GraphDirection.OPPOSITE since we can also just do a bitwise XOR on the direction index

propagate symmetry into the number of visibility data sets needed,
reduce stack size to 3*16 since paths can no longer fold back in on themselves

use symmetry to avoid doing half the work in generating visibility data

use bitfield index based neighbor traversal instead of conversion to coordinate triplets

in vis graph construction, skip origin directions that are opposite the allowed step directions since they cannot lead to any visibility

Remove debug code, undo unnecessary changes, delete unused code

Avoid continuing the search if all possible destination faces have already been reached

Don't generate full visibility data arrays if they're just the same everywhere

Fix crash when initializing air chunks and they have no visibility data

Perspective based occlusion culling

Improve accuracy of visibility data by not allowing the traversal to go backwards
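
Two of the bit-level changes listed in the commit notes above, replacing an opposite-direction lookup with an XOR and packing 10-bit quantized min/max angles for 3 planes into one long, can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: the class `GraphBitSketch`, the direction ordering (opposite faces on adjacent indices), and the bit layout inside the long are all hypothetical.

```java
// Hypothetical sketch; names and layouts are assumptions, not Sodium's actual definitions.
final class GraphBitSketch {
    // Assumed ordering: opposite faces occupy adjacent indices (0/1, 2/3, 4/5).
    static final int DOWN = 0, UP = 1, NORTH = 2, SOUTH = 3, WEST = 4, EAST = 5;

    // With that ordering, flipping the lowest bit yields the opposite direction,
    // so no lookup table is needed.
    static int opposite(int direction) {
        return direction ^ 1;
    }

    static final int BITS = 10;                // quantization width per angle
    static final long MASK = (1L << BITS) - 1; // 0x3FF

    // Assumed layout: plane p stores its min in bits [20p, 20p + 10)
    // and its max in bits [20p + 10, 20p + 20); 3 planes use 60 of 64 bits.
    static long withRange(long packed, int plane, int min, int max) {
        int shift = plane * 2 * BITS;
        long cleared = packed & ~(((MASK << BITS) | MASK) << shift);
        return cleared
                | ((min & MASK) << shift)
                | ((max & MASK) << (shift + BITS));
    }

    static int minOf(long packed, int plane) {
        return (int) ((packed >>> (plane * 2 * BITS)) & MASK);
    }

    static int maxOf(long packed, int plane) {
        return (int) ((packed >>> (plane * 2 * BITS + BITS)) & MASK);
    }

    public static void main(String[] args) {
        System.out.println(opposite(NORTH) == SOUTH);
        long packed = withRange(0L, 2, 5, 1023);
        System.out.println(minOf(packed, 2) + ".." + maxOf(packed, 2));
    }
}
```

Keeping all six values in a single long means the per-section state can be copied and compared as one primitive, which fits the commit notes' focus on per-section nanosecond costs during the BFS.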
@douira douira force-pushed the decoupled-frustum-test branch from c0802ad to c79fa5a Compare February 28, 2026 19:49
@douira douira force-pushed the decoupled-frustum-test branch from c79fa5a to fc42a69 Compare March 2, 2026 12:48
douira added a commit to douira/sodium that referenced this pull request Mar 24, 2026
douira added a commit to douira/sodium that referenced this pull request Mar 25, 2026
Labels

T-enhancement Type: Enhancement
