Add deep AI search to rerank all components in selected sources by Mbeaulne · Pull Request #2430 · TangleML/tangle-ui

Mbeaulne · 2026-06-18T17:55:58Z

Description

Adds a Deep AI search button alongside the existing AI (smart) search button. While the standard AI search sends a limited, curated set of candidate components to the reranker, Deep AI search sends the entire searchable index — lexical hits first so truncating providers still see the most likely matches early — allowing the model to rerank across every available component in the selected sources.

Key changes:

Introduces buildDeepAiCandidateMatches, which builds a candidate pool from all indexed components (no cap), ordered with lexical matches first and remaining components appended alphabetically.
Exposes canDeepRerank and deepRerank from useComponentSearchV2State and wires them into both the Dashboard and Editor search UIs.
Enriches RerankCandidate with richer I/O metadata (type and description, not just names) and a source field, giving the model more signal when ranking.
Refactors the shared startAiSearch / startRerank helpers to accept an arbitrary candidate list, removing duplication between the standard and deep paths.
Updates componentReferenceToCandidate to accept an optional source argument and serialize I/O types (including complex object types) via JSON.stringify.
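
The ordering described for buildDeepAiCandidateMatches can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the IndexEntry and LexicalMatch shapes, the substring-based lexicalSearchSketch stand-in, and the name buildDeepPoolSketch are all assumptions made for the example. Only the "lexical hits first, remainder alphabetical, deduplicated by digest, no cap" behaviour comes from the description above.

```typescript
// Hypothetical shapes; the real types live in the PR's search logic module.
interface IndexEntry {
  digest: string;
  name: string;
}

interface LexicalMatch {
  entry: IndexEntry;
  rank: number;
}

// Stand-in for the real lexical search: case-insensitive substring match on the name.
function lexicalSearchSketch(
  index: IndexEntry[],
  query: string,
  limit: number,
): LexicalMatch[] {
  const needle = query.toLowerCase();
  return index
    .filter((entry) => entry.name.toLowerCase().includes(needle))
    .slice(0, limit)
    .map((entry, i) => ({ entry, rank: i }));
}

function buildDeepPoolSketch(index: IndexEntry[], query: string): LexicalMatch[] {
  const candidates: LexicalMatch[] = [];
  const seenDigests = new Set<string>();

  // 1. Lexical hits first, so a truncating provider still sees likely matches early.
  for (const match of lexicalSearchSketch(index, query, index.length)) {
    if (seenDigests.has(match.entry.digest)) continue;
    seenDigests.add(match.entry.digest);
    candidates.push(match);
  }

  // 2. Every remaining entry, alphabetically, with a sentinel rank that sorts last.
  const rest = [...index].sort((a, b) => a.name.localeCompare(b.name));
  for (const entry of rest) {
    if (seenDigests.has(entry.digest)) continue;
    seenDigests.add(entry.digest);
    candidates.push({ entry, rank: Number.MAX_SAFE_INTEGER });
  }
  return candidates;
}
```

Because the copy of the index is sorted rather than the index itself, the caller's array order is left untouched.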

Related Issue and Pull requests

Type of Change

Checklist

I have tested this does not break current pipelines / runs functionality
I have tested the changes on staging

Screenshots (if applicable)

Test Instructions

Open the component search panel (Dashboard or Editor).
Enter a non-empty query.
Confirm the Deep AI search button appears and is enabled when an AI provider is configured.
Click Deep AI search and verify that results are reranked across all components in the selected sources, not just the top lexical candidates.
Confirm the existing AI search (sparkles) button still behaves as before.
Verify the button is disabled when the query is empty or no AI provider is configured.

Additional Comments

scoreAllCandidates is intentionally set to false for deep rerank to avoid scoring overhead across the full index on every result row. The standard AI search retains scoreAllCandidates: true so relevance percentages continue to appear on displayed results.

github-actions · 2026-06-18T17:56:08Z

🎩 Preview

A preview build has been created at: 06-18-improve_ai_rerank_payload_for_component_search/817441b

Mbeaulne · 2026-06-18T17:56:23Z

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite. Learn more about stacking.

Mbeaulne · 2026-06-18T18:56:58Z

+    candidates,
+    seenDigests,
+    lexicalSearch(index, trimmedQuery, {
+      limit: index.length,


🤖 This is an AI-generated code review comment.

[HIGH] Deep search builds an unbounded candidate pool: lexicalSearch(index, ..., { limit: index.length }) followed by appending every remaining entry with Number.MAX_SAFE_INTEGER, then JSON.stringify-ing each candidate into a billed LLM rerank prompt. Output is bounded (scoreAllCandidates: false → ≤20) but input is not — there is no cap, truncation, or confirmation. Cap the deep pool at a high-but-finite N (and/or surface a confirmation when the pool is very large). At minimum, make Number.MAX_SAFE_INTEGER a deliberate, documented bound rather than effectively unlimited.
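
One way to make the bound explicit, as the comment above suggests, is a small capping helper applied to the already-ordered pool. The sketch below is an assumption-laden illustration: the constant DEEP_RERANK_MAX_CANDIDATES, its value of 500, and the capDeepPool name are hypothetical and not taken from the PR.

```typescript
// Hypothetical ceiling; the right number depends on the provider's context window.
const DEEP_RERANK_MAX_CANDIDATES = 500;

interface CappedPool<T> {
  candidates: T[];
  // How many entries were dropped, so the UI can say so instead of truncating silently.
  omitted: number;
}

function capDeepPool<T>(
  pool: readonly T[],
  maxCandidates: number = DEEP_RERANK_MAX_CANDIDATES,
): CappedPool<T> {
  if (!Number.isInteger(maxCandidates) || maxCandidates < 0) {
    throw new RangeError("maxCandidates must be a non-negative integer");
  }
  // The pool is ordered lexical-hits-first, so keeping the head keeps the likeliest matches.
  const candidates = pool.slice(0, maxCandidates);
  return { candidates, omitted: pool.length - candidates.length };
}
```

Returning the omitted count leaves the UX decision (warning, confirmation, or silent cap) to the caller.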

Mbeaulne · 2026-06-18T18:57:34Z

+    const candidates: LexicalMatch[] = [];
+    const seenDigests = new Set<string>();
+    const allLexicalMatches = lexicalSearch(filteredIndex, query, {
+      limit: filteredIndex.length,


🤖 This is an AI-generated code review comment.

[HIGH] Same unbounded-deep-pool concern on the Dashboard surface: lexicalSearch(filteredIndex, ..., { limit: filteredIndex.length }) then every remaining sortedIndex entry is appended, and the whole pool is JSON.stringify-ed into a billed LLM rerank with no input cap or confirmation. Bound the deep pool to a finite N or confirm before sending a very large pool; document the worst-case prompt size.

Mbeaulne · 2026-06-18T18:57:35Z

            </Button>
+            <Button
+              variant="outline"
+              onClick={handleDeepAiSearch}


🤖 This is an AI-generated code review comment.

[MEDIUM] This "Deep AI search" Button gives no busy/in-progress feedback while isReranking, unlike the sibling Sparkles button which sets a dynamic aria-label. Add e.g. aria-label={isReranking ? "Deep AI search in progress" : "Deep AI search"} (and consider a busy indicator) so the disabled-while-reranking state is announced to assistive tech.
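
A small helper can keep the label and busy state together. This is only a sketch: the function name deepAiButtonA11yProps and the decision to disable the button while reranking are assumptions, and the label strings are the ones proposed in the comment above.

```typescript
interface DeepAiButtonA11yProps {
  "aria-label": string;
  "aria-busy": boolean;
  disabled: boolean;
}

// Derives the accessibility-related props for the Deep AI search button.
function deepAiButtonA11yProps(
  isReranking: boolean,
  canDeepRerank: boolean,
): DeepAiButtonA11yProps {
  return {
    "aria-label": isReranking ? "Deep AI search in progress" : "Deep AI search",
    "aria-busy": isReranking,
    disabled: isReranking || !canDeepRerank,
  };
}
```

In JSX the result could be spread onto the button, for example `<Button variant="outline" onClick={handleDeepAiSearch} {...deepAiButtonA11yProps(isReranking, canDeepRerank)} />`.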

Mbeaulne · 2026-06-18T18:57:36Z

  };

-  const handleSmartSearch = () => {
+  const startAiSearch = (matches: LexicalMatch[]) => {


🤖 This is an AI-generated code review comment.

[LOW] startAiSearch passes no scoreAllCandidates for either smart or deep, so both default to false here, whereas the Editor uses true for smart / false for deep. Routing the new deep button through this shared helper cements a smart/deep behavior divergence between the two surfaces. Consider passing a flag (as the Editor’s startRerank does) for parity.
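
For parity with the Editor, the shared Dashboard helper could take the flag explicitly. The sketch below uses simplified, hypothetical types (string candidates, a synchronous rerank callback, and a makeStartAiSearch factory); only the idea of passing scoreAllCandidates per call path comes from the comment above.

```typescript
interface RerankRequest {
  query: string;
  candidates: string[];
  scoreAllCandidates: boolean;
}

type RerankFn = (request: RerankRequest) => void;

// Builds a shared starter so the smart and deep buttons differ only in their arguments.
function makeStartAiSearch(rerank: RerankFn, query: string) {
  return (candidates: string[], options: { scoreAllCandidates: boolean }): void => {
    const trimmed = query.trim();
    // Nothing to rerank for an empty query or an empty pool.
    if (trimmed === "" || candidates.length === 0) return;
    rerank({
      query: trimmed,
      candidates,
      scoreAllCandidates: options.scoreAllCandidates,
    });
  };
}
```

With this shape, the smart handler would call the starter with `{ scoreAllCandidates: true }` and the deep handler with `{ scoreAllCandidates: false }`, matching the Editor behaviour described in the PR.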

Mbeaulne · 2026-06-18T18:57:37Z

  const trimmedQuery = query.trim();
  const lexicalMatches = buildLexicalMatches(index, trimmedQuery);
  const aiCandidateMatches = buildAiCandidateMatches(index, trimmedQuery);
+  const deepAiCandidateMatches = buildDeepAiCandidateMatches(


🤖 This is an AI-generated code review comment.

[LOW] buildDeepAiCandidateMatches (full-index lexicalSearch + full [...index].sort()) runs every render though it is only needed at click time. React Compiler memoizes this, so it is not a correctness bug. Optionally, compute only .length/enabled-state during render and build the ordered pool lazily inside the deep handler.
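
The lazy variant suggested above might look like the following sketch, which derives the enabled state cheaply during render and defers the expensive pool construction to the click handler. The makeDeepSearchControls name and its parameters are hypothetical; buildPool stands in for a call such as buildDeepAiCandidateMatches.

```typescript
interface DeepSearchControls {
  canDeepRerank: boolean;
  handleDeepAiSearch: () => void;
}

function makeDeepSearchControls<T>(
  indexLength: number,
  query: string,
  buildPool: () => T[],
  startRerank: (pool: T[]) => void,
): DeepSearchControls {
  // Cheap check: no search or sort is needed to know whether the button is usable.
  const canDeepRerank = query.trim() !== "" && indexLength > 0;

  const handleDeepAiSearch = (): void => {
    if (!canDeepRerank) return;
    // The O(n log n) work happens only here, at click time.
    startRerank(buildPool());
  };

  return { canDeepRerank, handleDeepAiSearch };
}
```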

Mbeaulne · 2026-06-18T18:57:45Z

🤖 This is an AI-generated code review comment.

[MEDIUM] src/hooks/useNaturalLanguageComponentSearch.ts ~lines 38-50 — The rerank mutation passes no AbortSignal, so a now-much-heavier deep rerank cannot be cancelled if the user retypes. Pre-existing, but materially amplified by this feature. Follow-up: thread signal into the mutation so a new query aborts the in-flight deep call.

(Posted as a PR-level comment because this file has no changed lines in the diff to anchor an inline comment to.)
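
A minimal sketch of the follow-up, assuming the underlying rerank call can accept an AbortSignal (how the hook's mutation library would receive it is not shown in this PR and is left as an assumption). The makeAbortableRerank wrapper and the RerankCall type are hypothetical names; AbortController itself is the standard Web/Node API.

```typescript
// Hypothetical signature for the underlying request function.
type RerankCall<T> = (query: string, signal: AbortSignal) => Promise<T>;

// Wraps a rerank call so that starting a new one aborts whichever is still in flight.
function makeAbortableRerank<T>(call: RerankCall<T>): (query: string) => Promise<T> {
  let inFlight: AbortController | null = null;

  return (query: string): Promise<T> => {
    // Cancel the previous request, if any; aborting a finished request is a no-op.
    inFlight?.abort();
    const controller = new AbortController();
    inFlight = controller;
    return call(query, controller.signal);
  };
}
```

The request function is expected to honour the signal, for example by forwarding it to `fetch(url, { signal })`, and callers should treat the resulting abort error as a cancellation rather than a failure.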

camielvs · 2026-06-19T22:13:23Z

🤖 Code review — Add deep AI search to rerank all components in selected sources

This is the most consequential PR in the stack — it adds a "Deep AI" button that reranks the entire library, plus a richer candidate payload (per-IO name/type/description + source). The startRerank/startAiSearch refactor with the scoreAllCandidates flag is clean, the richer payload should genuinely help model judgment, and the deep-pool test (lexical hits first, then the rest) pins the ordering. But a few things need attention before this ships.

Main concerns

Deep search sends an unbounded candidate set to the LLM in one request. buildDeepAiCandidateMatches returns every searchable component (limit: index.length, appended with Number.MAX_SAFE_INTEGER), and the new per-IO payload (name + type + description) inflates each candidate vs the old name-only arrays. scoreAllCandidates: false bounds the output (1500 tokens / top-20) but the input is uncapped. For a few hundred components this is a large-but-survivable prompt; for a big registered library it risks blowing the model/provider context window — which surfaces as a hard error or, worse, silent truncation. The "lexical hits first so truncating providers see likely matches early" comment helps ordering but doesn't prevent the failure. Recommend a hard candidate cap (or a token-budget estimate with chunked passes), and deciding what the UX does when the library exceeds it. Right now nothing communicates a ceiling.
Post-deep-rerank the full reordered library is rendered with no virtualization. After a deep rerank, displayedMatches/displayedResults becomes every component reordered (Editor rerankedMatches, Dashboard mergeRerankIntoLexical over rerankBaseMatches = deep pool). ComponentSearchResults does a plain results.map(...) — no windowing, no cap — and the header prints Search Results ({results.length}). Deep search on a large library renders hundreds/thousands of DOM rows. Confirm the list is virtualized or cap the displayed set.
The two search surfaces have diverged into parallel implementations. Deep-pool construction is duplicated: the Editor uses buildDeepAiCandidateMatches from componentSearchV2Logic, while DashboardComponentsV2View reimplements it inline as an IIFE. This compounds existing divergence: the Dashboard's aiCandidateMatches is the older simple broad || sampleEvenly form and never received the source-diversity tiering from #2429 (Add source-diverse AI rerank candidate pool), and the two use different merge helpers (rerankedMatches vs mergeRerankIntoLexical). These will drift in behavior and bugs. Strongly consider extracting the shared selection/merge logic so both surfaces stay consistent.
scoreAllCandidates differs between surfaces. In the Editor, smart = true (every result badged with a relevance %), deep = false. In the Dashboard, both smart and deep go through startAiSearch → rerank({ query, candidates }) with no scoreAllCandidates at all (so it defaults false). So Dashboard smart search behaves differently from Editor smart search. If that's intentional, a comment would help; if not, align them.
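
The "token-budget estimate with chunked passes" option from the first concern could be sketched as below. Everything here is illustrative: the four-characters-per-token heuristic is a rough assumption rather than any provider's tokenizer, and estimateTokens and chunkByTokenBudget are hypothetical names. How per-chunk results would be merged into one ranking is deliberately left open.

```typescript
// Rough heuristic (assumption): about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Splits an ordered candidate list into chunks whose serialized size fits the budget.
function chunkByTokenBudget<T>(
  candidates: readonly T[],
  maxTokensPerChunk: number,
): T[][] {
  const chunks: T[][] = [];
  let current: T[] = [];
  let used = 0;

  for (const candidate of candidates) {
    const cost = estimateTokens(JSON.stringify(candidate));
    // Start a new chunk when this candidate would overflow a non-empty one.
    if (current.length > 0 && used + cost > maxTokensPerChunk) {
      chunks.push(current);
      current = [];
      used = 0;
    }
    // A single oversized candidate still gets a chunk of its own.
    current.push(candidate);
    used += cost;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Chunk order follows input order, so the lexical-hits-first ordering of the deep pool carries over to the first pass.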

Minor

source.label (user-controlled for registered libraries) now flows into the candidate payload. It's inside the <candidates> block already framed as untrusted in the system prompt, so this is consistent with the existing IO/description fields — just noting the new field is equally attacker-influenceable and relies on that same framing.
The deep pool sorts the whole index alphabetically ([...index].sort(...)) on every render the button is enabled; with React Compiler memoization this is probably fine, but it's O(n log n) per render for a list that only matters on click — could be lazily computed in the handler.

Solid direction and a genuinely useful feature; the input-size ceiling and the surface duplication are the two I'd want resolved before merge.

Mbeaulne mentioned this pull request Jun 18, 2026

Add source-diverse AI rerank candidate pool #2429

Open

8 tasks

Mbeaulne changed the title ~~Improve AI rerank payload for component search~~ Add deep AI search to rerank all components in selected sources Jun 18, 2026

Mbeaulne mentioned this pull request Jun 18, 2026

Add search quality expectation tests for lexical search #2431

Open

8 tasks

Mbeaulne marked this pull request as ready for review June 18, 2026 17:59

Mbeaulne requested a review from a team as a code owner June 18, 2026 17:59

This was referenced Jun 18, 2026

add client-side embeddings cached in IndexedDB #2432

Open

Debounce component search input #2433

Open


Mbeaulne force-pushed the 06-18-improve_ai_rerank_payload_for_component_search branch from 7d30372 to c443c7a Compare June 18, 2026 19:12

Mbeaulne force-pushed the 06-18-build_broader_ai_candidate_pools_for_component_search branch from d363ca7 to 60b076d Compare June 18, 2026 19:12

Mbeaulne force-pushed the 06-18-improve_ai_rerank_payload_for_component_search branch from c443c7a to 9fdd3d5 Compare June 18, 2026 20:28

Mbeaulne force-pushed the 06-18-build_broader_ai_candidate_pools_for_component_search branch from 60b076d to 455266e Compare June 18, 2026 20:28

Mbeaulne force-pushed the 06-18-improve_ai_rerank_payload_for_component_search branch from 9fdd3d5 to d9e254e Compare June 18, 2026 20:49

Mbeaulne force-pushed the 06-18-build_broader_ai_candidate_pools_for_component_search branch 2 times, most recently from 4a246ee to 8cc6222 Compare June 18, 2026 21:02

Mbeaulne force-pushed the 06-18-improve_ai_rerank_payload_for_component_search branch from d9e254e to 1351eea Compare June 18, 2026 21:02

Improve AI rerank payload for component search

817441b

Mbeaulne force-pushed the 06-18-build_broader_ai_candidate_pools_for_component_search branch from 8cc6222 to 88f3546 Compare June 18, 2026 21:16

Mbeaulne force-pushed the 06-18-improve_ai_rerank_payload_for_component_search branch from 1351eea to 817441b Compare June 18, 2026 21:16

Mbeaulne wants to merge 1 commit into 06-18-build_broader_ai_candidate_pools_for_component_search from 06-18-improve_ai_rerank_payload_for_component_search
