perf: reduce db bandwidth — eliminate badge reads + embedding lookup table #441
sethconvex wants to merge 6 commits into openclaw:main
Conversation
…encies

The 5-minute stat event processor was patching skill documents on every run, which invalidated listPublicPageV2 reactive queries for ALL subscribers — causing a thundering herd responsible for ~17 TB (59%) of the 28.65 TB monthly db bandwidth. Split into two paths:

- Daily stats (15-min cron): writes to skillDailyStats only, no skill doc patches
- Skill doc sync (6-hour cron): patches skill documents with accumulated deltas

Also skip reading version docs in listPublicPageV2 and search hydration (version data is only needed on detail pages, not listings).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
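For reference, a minimal sketch of how the two cadences described above could be registered in Convex. The module path and function names (`internal.stats.flushDailyStats`, `internal.stats.syncSkillDocs`) are assumptions; only the 15-minute and 6-hour intervals come from the commit message.

```ts
// convex/crons.ts (sketch). Function names are hypothetical; only the
// 15-minute / 6-hour split is taken from the commit message above.
import { cronJobs } from "convex/server";
import { internal } from "./_generated/api";

const crons = cronJobs();

// Frequent path: write accumulated stats to skillDailyStats only,
// never patching skill docs, so listing subscriptions stay quiet.
crons.interval(
  "flush daily skill stats",
  { minutes: 15 },
  internal.stats.flushDailyStats,
);

// Infrequent path: fold accumulated deltas into the skill documents,
// so listPublicPageV2 subscribers are invalidated at most every 6 hours.
crons.interval(
  "sync skill docs from daily stats",
  { hours: 6 },
  internal.stats.syncSkillDocs,
);

export default crons;
```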
@sethconvex is attempting to deploy a commit to the Amantus Machina Team on Vercel. A member of the Team first needs to authorize it.
Skills should never have more than a handful of badge records. Using .take(10) instead of .collect() avoids unbounded reads.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
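As a sketch of that bounded read (the index and field names are assumptions; only the switch from .collect() to .take(10) is stated in the commit):

```ts
// Inside a Convex query/mutation handler, with ctx and skillId in scope.
// Index and field names are assumed; the .take(10) bound is from the commit above.
const badges = await ctx.db
  .query("skillBadges")
  .withIndex("by_skill", (q) => q.eq("skillId", skillId))
  .take(10); // reads at most 10 docs; .collect() would read every matching row
```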
- Remove badge table queries from listing and search paths (~200 queries per page load eliminated). Use denormalized skill.badges field instead.
- Sync skill.badges when badges are mutated (upsertSkillBadge/removeSkillBadge).
- Add embeddingSkillMap lookup table (~100 bytes/doc) so search hydration can skip reading full skillEmbeddings docs (~12KB each with vector).
- Remove dead badge query exports from search module.
- Reduce lexical fallback scan limit from 1200 to 500.
- Add backfill mutation for embeddingSkillMap with graceful fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
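A possible shape for the embeddingSkillMap table in the Convex schema, shown as a sketch. The field and index names are assumptions; the table's purpose (resolving an embedding id to its skill id without touching the ~12KB vector doc) is taken from the commit above.

```ts
// convex/schema.ts (excerpt, sketch). Field and index names are assumed.
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  // ...existing tables (skills, skillBadges, skillEmbeddings, ...)
  embeddingSkillMap: defineTable({
    embeddingId: v.id("skillEmbeddings"),
    skillId: v.id("skills"),
  }).index("by_embeddingId", ["embeddingId"]),
});
```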
convex/skills.ts (Outdated)
const skill = await ctx.db.get(skillId)
if (skill) {
  const updatedBadges = { ...(skill.badges ?? {}) }
  updatedBadges[kind] = undefined
setting property to undefined doesn't remove it from the object. use delete operator instead to properly remove the badge field
- updatedBadges[kind] = undefined
+ delete updatedBadges[kind]
Fixed in bfbccbb — used destructuring to properly remove the key: const { [kind]: _, ...remainingBadges } = ...
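For clarity, a small sketch of that destructuring pattern inside the badge-removal mutation. Only kind and remainingBadges are quoted in the reply; everything else here is illustrative.

```ts
// Inside the removal mutation handler, with ctx, skillId, kind, and skill in scope.
// Drops one badge key without using `delete` and without leaving an explicit
// `undefined` value (which Convex validation can reject).
const badges: Record<string, unknown> = skill.badges ?? {};
const { [kind]: _removed, ...remainingBadges } = badges;
await ctx.db.patch(skillId, { badges: remainingBadges });
```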
convex/maintenance.ts

// Backfill embeddingSkillMap from existing skillEmbeddings.
// Run once after deploying the schema change:
//   npx convex run maintenance:backfillEmbeddingSkillMapInternal --prod
export const backfillEmbeddingSkillMapInternal = internalMutation({
consider adding a backfill to populate skill.badges from existing skillBadges table records. currently only new badge changes will sync to skill docs, leaving existing skills without denormalized badges until their badges are modified
Added backfillDenormalizedBadgesInternal in a22521f — reads from skillBadges table and syncs to skill.badges field. Self-scheduling paginated mutation, included in the migration steps.
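A sketch of what a self-scheduling paginated backfill along these lines can look like in Convex. The function name, batch size, index, and badge field names below are assumptions; only the skillBadges → skill.badges sync and the self-scheduling pagination are from the reply above.

```ts
// convex/maintenance.ts (sketch). The real backfillDenormalizedBadgesInternal
// may differ; this shows the generic self-scheduling pagination pattern.
import { internalMutation } from "./_generated/server";
import { internal } from "./_generated/api";
import { v } from "convex/values";

export const backfillDenormalizedBadgesSketch = internalMutation({
  args: { cursor: v.optional(v.string()) },
  handler: async (ctx, { cursor }) => {
    // Walk the skills table one page at a time.
    const page = await ctx.db
      .query("skills")
      .paginate({ cursor: cursor ?? null, numItems: 100 });

    for (const skill of page.page) {
      // Rebuild the denormalized badges object from the skillBadges table.
      // Index name and the kind/value fields are assumptions.
      const badgeDocs = await ctx.db
        .query("skillBadges")
        .withIndex("by_skill", (q) => q.eq("skillId", skill._id))
        .take(10);
      const badges = Object.fromEntries(badgeDocs.map((b) => [b.kind, b.value]));
      await ctx.db.patch(skill._id, { badges });
    }

    // Self-schedule the next batch until the table is exhausted.
    if (!page.isDone) {
      await ctx.scheduler.runAfter(
        0,
        internal.maintenance.backfillDenormalizedBadgesSketch,
        { cursor: page.continueCursor },
      );
    }
  },
});
```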
- Add backfillDenormalizedBadgesInternal: syncs skillBadges table → skill.badges field so listing/search reads are correct
- Simplify hydrateResults fallback

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
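To illustrate the hydration fallback referenced above and in the summary: a sketch, assuming embeddingSkillMap has a by_embeddingId index and that skillEmbeddings docs carry a skillId field. The actual hydrateResults implementation is not shown in this excerpt.

```ts
// Sketch of the lookup-with-fallback used during search hydration (names assumed).
// Prefer the ~100-byte embeddingSkillMap row; fall back to the full ~12KB
// skillEmbeddings doc for rows that have not been backfilled yet.
import type { QueryCtx } from "./_generated/server";
import type { Id } from "./_generated/dataModel";

async function resolveSkillId(ctx: QueryCtx, embeddingId: Id<"skillEmbeddings">) {
  const mapped = await ctx.db
    .query("embeddingSkillMap")
    .withIndex("by_embeddingId", (q) => q.eq("embeddingId", embeddingId))
    .unique();
  if (mapped) return mapped.skillId;

  // Graceful fallback: read the full embedding doc (includes the vector).
  const embedding = await ctx.db.get(embeddingId);
  return embedding?.skillId ?? null;
}
```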
…Badge

Avoids leaving an explicit undefined key in the badges object which could fail Convex validation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
bfbccbb to 16f69f5
Summary
Reduces Convex database bandwidth by eliminating redundant reads in the three highest-bandwidth functions:
- `listPublicPageV2` (1.75 TB) — Removed 200 individual `skillBadges` table queries per page load. Uses the denormalized `skill.badges` field already on the skill doc. Badge mutations now patch the skill doc directly, keeping it in sync.
- `search.hydrateResults` (932 GB) — Added `embeddingSkillMap` lookup table (~100 bytes/doc) so hydration can resolve `embeddingId → skillId` without reading full `skillEmbeddings` docs (~12KB each). Includes a graceful fallback for entries not yet backfilled.
- `search.lexicalFallbackSkills` (120 GB) — Removed badge table reads. Reduced `FALLBACK_SCAN_LIMIT` from 1200 to 500.
- Badge data consistency — `upsertSkillBadge` and `removeSkillBadge` now sync the `skill.badges` field on the skill doc, keeping it in sync with the `skillBadges` table going forward.

User-facing impacts
Migration steps
1. Deploy
Deploy the code normally. New embeddings and badge changes will automatically use the new paths. Fallbacks ensure everything works during backfills.
2. Run backfills
Both are self-scheduling paginated mutations that process in batches.
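The exact commands are not reproduced in this excerpt; based on the code comment in convex/maintenance.ts and the mutation added in a22521f, they are presumably run as:

```sh
# From the comment in convex/maintenance.ts (verbatim):
npx convex run maintenance:backfillEmbeddingSkillMapInternal --prod

# Assumed to follow the same pattern (mutation name from a22521f):
npx convex run maintenance:backfillDenormalizedBadgesInternal --prod
```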
3. Verify
After backfills complete:
Risk
Low. All changes have graceful fallbacks:
- `embeddingSkillMap` row doesn't exist → falls back to reading the full embedding doc
- `removeSkillBadge` uses destructuring to properly remove keys (no stale `undefined` values)

Test plan
🤖 Generated with Claude Code