Appreciate the discussion here — the agent angle stood out as especially practical. A small pattern that may help:
If useful, we keep a structured index of similar workflows here: https://skillslookup.com Happy to share more detail if it would help.
I appreciate the more LogSeq-native chunking by top-level blocks; it puts LogSeq's logical note structure to good use. I think there could be a further upgrade, though. Just putting some thoughts down now.
Size-aware chunking
If there are many small top-level blocks, each one alone can easily lose its context. Define an OPTIMAL_CHUNK_SIZE (or reuse MIN_CHUNK_SIZE for simplicity). If a top-level block does not meet that size, recursively swallow adjacent blocks too.
Define a MAX_CHUNK_SIZE. If a top-level block is really large, split its downstream content into chunks that also aim for OPTIMAL_CHUNK_SIZE. The even more interesting part: to avoid losing the context of where each piece belongs, include a breadcrumb at the top of the chunk indicating its origin. It may even include a smart ellipsis to indicate that at this level there are x more nodes. Keep it recursive.
I've said this very simply, but I know it requires a somewhat sophisticated tree algorithm to get optimal chunks.
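A rough sketch of what that tree algorithm could look like. Everything here is my own illustrative assumption, not the plugin's actual implementation: the `Block` shape, the size constants, and the greedy flush strategy are all hypothetical.

```python
# Illustrative sketch of size-aware chunking over a LogSeq-style block tree.
# Block, the constants, and all function names are hypothetical assumptions.
from dataclasses import dataclass, field

MIN_CHUNK_SIZE = 50       # illustrative values; normally these would be bigger
OPTIMAL_CHUNK_SIZE = 120
MAX_CHUNK_SIZE = 200

@dataclass
class Block:
    text: str
    children: list["Block"] = field(default_factory=list)

def size(block: Block) -> int:
    """Total character count of a block and its whole subtree."""
    return len(block.text) + sum(size(c) for c in block.children)

def flatten(block: Block, depth: int = 0) -> str:
    """Render a subtree as indented outline text."""
    lines = ["  " * depth + block.text]
    for c in block.children:
        lines.append(flatten(c, depth + 1))
    return "\n".join(lines)

def chunk_page(top_blocks: list[Block]) -> list[str]:
    """Merge small top-level blocks, split large ones, keep it recursive."""
    chunks, buffer, buf_size = [], [], 0
    for blk in top_blocks:
        s = size(blk)
        if s > MAX_CHUNK_SIZE:
            if buffer:  # flush pending small blocks before the big one
                chunks.append("\n".join(buffer)); buffer, buf_size = [], 0
            chunks.extend(split_large(blk, breadcrumb=blk.text))
        else:
            # swallow adjacent small blocks until we reach the optimum
            buffer.append(flatten(blk)); buf_size += s
            if buf_size >= OPTIMAL_CHUNK_SIZE:
                chunks.append("\n".join(buffer)); buffer, buf_size = [], 0
    if buffer:  # tiny trailing chunk: best-fit, can't reach optimal
        chunks.append("\n".join(buffer))
    return chunks

def split_large(block: Block, breadcrumb: str) -> list[str]:
    """Split a large block's children, prefixing each chunk with a breadcrumb."""
    chunks, buffer, buf_size = [], [], 0
    for child in block.children:
        s = size(child)
        if s > MAX_CHUNK_SIZE:  # a child can itself be too big: recurse
            if buffer:
                chunks.append(breadcrumb + "\n" + "\n".join(buffer))
                buffer, buf_size = [], 0
            chunks.extend(split_large(child, breadcrumb + " > " + child.text))
            continue
        buffer.append(flatten(child, depth=1)); buf_size += s
        if buf_size >= OPTIMAL_CHUNK_SIZE:
            chunks.append(breadcrumb + "\n" + "\n".join(buffer))
            buffer, buf_size = [], 0
    if buffer:
        chunks.append(breadcrumb + "\n" + "\n".join(buffer))
    return chunks
```

The trailing `if buffer` flush is what produces the "best-fit, can't do optimal" case for a tiny last chunk; a real implementation would also want to enforce MIN_CHUNK_SIZE there and count tokens rather than characters.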
Below is an example. The parameters are not actually calculated, just illustrative, and kept small for brevity of explanation; normally they would be bigger.
Original File
Chunks
Chunk 1 - merged small top-level blocks
Chunk 2 - large top-level block split, first subtree
Chunk 3
Chunk 4
Chunk 5 - tiny trailing top-level block (best-fit, can't do optimal, but meets the minimum)
The tokens added by breadcrumbs can be concerning; especially when chunk sizes are tiny, this can inflate them greatly. That is a reason to make breadcrumbs optional. But they could also be made shorter:
For example, `(... 2 omitted sibling blocks before)` could become `(+2)`, with an explanation to the agent at the MCP level that `(+N)` inside the retrieved data indicates omitted sibling blocks for context. With this breadcrumb + sibling data, the agent can then use `get_page_content`, `search` or `query` to selectively retrieve the full context, as it now sees the hierarchy.

What's your view on this? I may give it a shot myself with a fork.
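The compact breadcrumb above could be rendered with something like this (the `(+N)` convention and the function name are purely illustrative):

```python
def compact_breadcrumb(path: list[str], omitted_before: int, omitted_after: int) -> str:
    """Render an ancestor path plus omitted-sibling counts as a short marker.

    The agent would be told once, at the MCP level, that (+N) means N sibling
    blocks were omitted at that position, so the marker itself stays tiny.
    """
    parts = []
    if omitted_before:
        parts.append(f"(+{omitted_before})")
    parts.append(" > ".join(path))
    if omitted_after:
        parts.append(f"(+{omitted_after})")
    return " ".join(parts)
```

So `compact_breadcrumb(["Page", "Section"], 2, 0)` yields `(+2) Page > Section`: a few tokens instead of a full sentence per chunk.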