perf: reduce db bandwidth — eliminate badge reads + embedding lookup table #441
sethconvex wants to merge 6 commits into openclaw:main
Conversation
…encies

The 5-minute stat event processor was patching skill documents on every run, which invalidated listPublicPageV2 reactive queries for ALL subscribers — causing a thundering herd responsible for ~17 TB (59%) of the 28.65 TB monthly db bandwidth. Split into two paths:

- Daily stats (15-min cron): writes to skillDailyStats only, no skill doc patches
- Skill doc sync (6-hour cron): patches skill documents with accumulated deltas

Also skip reading version docs in listPublicPageV2 and search hydration (version data is only needed on detail pages, not listings).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
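For reference, a minimal sketch of how the two cadences described above could be registered in Convex. The module path and function names (`internal.stats.flushDailyStats`, `internal.stats.syncSkillDocs`) are assumptions; only the 15-minute and 6-hour intervals come from the commit message.

```ts
// convex/crons.ts (sketch). Function names are hypothetical; only the
// 15-minute / 6-hour split is taken from the commit message above.
import { cronJobs } from "convex/server";
import { internal } from "./_generated/api";

const crons = cronJobs();

// Frequent path: write accumulated stats to skillDailyStats only,
// never patching skill docs, so listing subscriptions stay quiet.
crons.interval(
  "flush daily skill stats",
  { minutes: 15 },
  internal.stats.flushDailyStats,
);

// Infrequent path: fold accumulated deltas into the skill documents,
// so listPublicPageV2 subscribers are invalidated at most every 6 hours.
crons.interval(
  "sync skill docs from daily stats",
  { hours: 6 },
  internal.stats.syncSkillDocs,
);

export default crons;
```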
@sethconvex is attempting to deploy a commit to the Amantus Machina Team on Vercel. A member of the Team first needs to authorize it.
Skills should never have more than a handful of badge records. Using .take(10) instead of .collect() avoids unbounded reads.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
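As a sketch of that bounded read (the index and field names are assumptions; only the switch from .collect() to .take(10) is stated in the commit):

```ts
// Inside a Convex query/mutation handler, with ctx and skillId in scope.
// Index and field names are assumed; the .take(10) bound is from the commit above.
const badges = await ctx.db
  .query("skillBadges")
  .withIndex("by_skill", (q) => q.eq("skillId", skillId))
  .take(10); // reads at most 10 docs; .collect() would read every matching row
```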
- Remove badge table queries from listing and search paths (~200 queries per page load eliminated). Use denormalized skill.badges field instead.
- Sync skill.badges when badges are mutated (upsertSkillBadge/removeSkillBadge).
- Add embeddingSkillMap lookup table (~100 bytes/doc) so search hydration can skip reading full skillEmbeddings docs (~12KB each with vector).
- Remove dead badge query exports from search module.
- Reduce lexical fallback scan limit from 1200 to 500.
- Add backfill mutation for embeddingSkillMap with graceful fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
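A possible shape for the embeddingSkillMap table in the Convex schema, shown as a sketch. The field and index names are assumptions; the table's purpose (resolving an embedding id to its skill id without touching the ~12KB vector doc) is taken from the commit above.

```ts
// convex/schema.ts (excerpt, sketch). Field and index names are assumed.
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  // ...existing tables (skills, skillBadges, skillEmbeddings, ...)
  embeddingSkillMap: defineTable({
    embeddingId: v.id("skillEmbeddings"),
    skillId: v.id("skills"),
  }).index("by_embeddingId", ["embeddingId"]),
});
```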
convex/skills.ts (Outdated)
const skill = await ctx.db.get(skillId)
if (skill) {
  const updatedBadges = { ...(skill.badges ?? {}) }
  updatedBadges[kind] = undefined
setting property to undefined doesn't remove it from the object. use delete operator instead to properly remove the badge field
- updatedBadges[kind] = undefined
+ delete updatedBadges[kind]
Fixed in bfbccbb — used destructuring to properly remove the key: const { [kind]: _, ...remainingBadges } = ...
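For clarity, a small sketch of that destructuring pattern inside the badge-removal mutation. Only kind and remainingBadges are quoted in the reply; everything else here is illustrative.

```ts
// Inside the removal mutation handler, with ctx, skillId, kind, and skill in scope.
// Drops one badge key without using `delete` and without leaving an explicit
// `undefined` value (which Convex validation can reject).
const badges: Record<string, unknown> = skill.badges ?? {};
const { [kind]: _removed, ...remainingBadges } = badges;
await ctx.db.patch(skillId, { badges: remainingBadges });
```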
convex/maintenance.ts

// Backfill embeddingSkillMap from existing skillEmbeddings.
// Run once after deploying the schema change:
//   npx convex run maintenance:backfillEmbeddingSkillMapInternal --prod
export const backfillEmbeddingSkillMapInternal = internalMutation({
consider adding a backfill to populate skill.badges from existing skillBadges table records. currently only new badge changes will sync to skill docs, leaving existing skills without denormalized badges until their badges are modified
Added backfillDenormalizedBadgesInternal in a22521f — reads from skillBadges table and syncs to skill.badges field. Self-scheduling paginated mutation, included in the migration steps.
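A sketch of what a self-scheduling paginated backfill along these lines can look like in Convex. The function name, batch size, index, and badge field names below are assumptions; only the skillBadges → skill.badges sync and the self-scheduling pagination are from the reply above.

```ts
// convex/maintenance.ts (sketch). The real backfillDenormalizedBadgesInternal
// may differ; this shows the generic self-scheduling pagination pattern.
import { internalMutation } from "./_generated/server";
import { internal } from "./_generated/api";
import { v } from "convex/values";

export const backfillDenormalizedBadgesSketch = internalMutation({
  args: { cursor: v.optional(v.string()) },
  handler: async (ctx, { cursor }) => {
    // Walk the skills table one page at a time.
    const page = await ctx.db
      .query("skills")
      .paginate({ cursor: cursor ?? null, numItems: 100 });

    for (const skill of page.page) {
      // Rebuild the denormalized badges object from the skillBadges table.
      // Index name and the kind/value fields are assumptions.
      const badgeDocs = await ctx.db
        .query("skillBadges")
        .withIndex("by_skill", (q) => q.eq("skillId", skill._id))
        .take(10);
      const badges = Object.fromEntries(badgeDocs.map((b) => [b.kind, b.value]));
      await ctx.db.patch(skill._id, { badges });
    }

    // Self-schedule the next batch until the table is exhausted.
    if (!page.isDone) {
      await ctx.scheduler.runAfter(
        0,
        internal.maintenance.backfillDenormalizedBadgesSketch,
        { cursor: page.continueCursor },
      );
    }
  },
});
```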
- Add backfillDenormalizedBadgesInternal: syncs skillBadges table → skill.badges field so listing/search reads are correct
- Simplify hydrateResults fallback

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
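To illustrate the hydration fallback referenced above and in the summary: a sketch, assuming embeddingSkillMap has a by_embeddingId index and that skillEmbeddings docs carry a skillId field. The actual hydrateResults implementation is not shown in this excerpt.

```ts
// Sketch of the lookup-with-fallback used during search hydration (names assumed).
// Prefer the ~100-byte embeddingSkillMap row; fall back to the full ~12KB
// skillEmbeddings doc for rows that have not been backfilled yet.
import type { QueryCtx } from "./_generated/server";
import type { Id } from "./_generated/dataModel";

async function resolveSkillId(ctx: QueryCtx, embeddingId: Id<"skillEmbeddings">) {
  const mapped = await ctx.db
    .query("embeddingSkillMap")
    .withIndex("by_embeddingId", (q) => q.eq("embeddingId", embeddingId))
    .unique();
  if (mapped) return mapped.skillId;

  // Graceful fallback: read the full embedding doc (includes the vector).
  const embedding = await ctx.db.get(embeddingId);
  return embedding?.skillId ?? null;
}
```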
…Badge

Avoids leaving an explicit undefined key in the badges object which could fail Convex validation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
bfbccbb to 16f69f5
Summary
Reduces Convex database bandwidth by eliminating redundant reads in the three highest-bandwidth functions:
- `listPublicPageV2` (1.75 TB) — Removed 200 individual `skillBadges` table queries per page load. Uses the denormalized `skill.badges` field already on the skill doc. Badge mutations now patch the skill doc directly, keeping it in sync.
- `search.hydrateResults` (932 GB) — Added `embeddingSkillMap` lookup table (~100 bytes/doc) so hydration can resolve `embeddingId → skillId` without reading full `skillEmbeddings` docs (~12KB each). Includes a graceful fallback for entries not yet backfilled.
- `search.lexicalFallbackSkills` (120 GB) — Removed badge table reads. Reduced `FALLBACK_SCAN_LIMIT` from 1200 to 500.
- Badge data consistency — `upsertSkillBadge` and `removeSkillBadge` now sync the `skill.badges` field on the skill doc, keeping it in sync with the `skillBadges` table going forward.

User-facing impacts
Migration steps
1. Deploy
Deploy the code normally. New embeddings and badge changes will automatically use the new paths. Fallbacks ensure everything works during backfills.
2. Run backfills
Both are self-scheduling paginated mutations that process in batches.
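The exact commands are not reproduced in this excerpt; based on the code comment in convex/maintenance.ts and the mutation added in a22521f, they are presumably run as:

```sh
# From the comment in convex/maintenance.ts (verbatim):
npx convex run maintenance:backfillEmbeddingSkillMapInternal --prod

# Assumed to follow the same pattern (mutation name from a22521f):
npx convex run maintenance:backfillDenormalizedBadgesInternal --prod
```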
3. Verify
After backfills complete:
Risk
Low. All changes have graceful fallbacks:
- `embeddingSkillMap` row doesn't exist → falls back to reading the full embedding doc
- `removeSkillBadge` uses destructuring to properly remove keys (no stale `undefined` values)

Test plan
🤖 Generated with Claude Code