perf: reduce db bandwidth — eliminate badge reads + embedding lookup table #441

Open
sethconvex wants to merge 6 commits into openclaw:main from sethconvex:reduce-db-bandwidth
sethconvex commented Feb 19, 2026

Summary

Reduces Convex database bandwidth by eliminating redundant reads in the three highest-bandwidth functions:

  • listPublicPageV2 (1.75 TB) — Removed 200 individual skillBadges table queries per page load. Uses the denormalized skill.badges field already on the skill doc. Badge mutations now patch the skill doc directly, keeping it in sync.

  • search.hydrateResults (932 GB) — Added embeddingSkillMap lookup table (~100 bytes/doc) so hydration can resolve embeddingId → skillId without reading full skillEmbeddings docs (~12KB each). Includes a graceful fallback for entries not yet backfilled.

  • search.lexicalFallbackSkills (120 GB) — Removed badge table reads. Reduced FALLBACK_SCAN_LIMIT from 1200 to 500.

  • Badge data consistency: upsertSkillBadge and removeSkillBadge now patch the skill.badges field on the skill doc, keeping it in sync with the skillBadges table going forward.
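
The badge change in the first and last bullets can be illustrated with a minimal in-memory sketch. It is not the PR's Convex code: the Maps and arrays stand in for the skills and skillBadges tables, and the Badge fields and the read counter are assumptions made for illustration. It shows the old listing path issuing one badge query per skill, the new path reading the field already on the skill doc, and the mutation writing both places.

```typescript
// Hypothetical in-memory stand-ins for the skills and skillBadges tables.
type BadgeKind = string;
type Badge = { byUserId: string; at: number }; // assumed shape, for illustration only
type Skill = { id: string; name: string; badges?: Record<BadgeKind, Badge> };
type SkillBadgeRow = { skillId: string; kind: BadgeKind; badge: Badge };
type ListedSkill = { id: string; badges: Record<BadgeKind, Badge> };

const skills = new Map<string, Skill>();
let skillBadges: SkillBadgeRow[] = [];
let badgeTableReads = 0; // counts queries against the badge table

// Old listing path: one badge-table query per skill on the page.
function listWithBadgeQueries(ids: string[]): ListedSkill[] {
  return ids.map((id) => {
    badgeTableReads += 1;
    const badges: Record<BadgeKind, Badge> = {};
    for (const row of skillBadges) {
      if (row.skillId === id) badges[row.kind] = row.badge;
    }
    return { id, badges };
  });
}

// New listing path: badges come from the skill doc that is already loaded.
function listDenormalized(ids: string[]): ListedSkill[] {
  return ids.map((id) => ({ id, badges: skills.get(id)?.badges ?? {} }));
}

// The mutation writes the table row and patches the skill doc in the same step,
// so the denormalized field does not drift from the table.
function upsertSkillBadge(skillId: string, kind: BadgeKind, badge: Badge): void {
  skillBadges = skillBadges.filter((r) => !(r.skillId === skillId && r.kind === kind));
  skillBadges.push({ skillId, kind, badge });
  const skill = skills.get(skillId);
  if (skill) {
    skills.set(skillId, { ...skill, badges: { ...(skill.badges ?? {}), [kind]: badge } });
  }
}
```

In this sketch a page of N skills costs N badge-table queries on the old path and none on the new one, which is the shape of the saving the summary describes.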

User-facing impacts

  • Badges: Skills with badges set before this deploy may show stale badge state until the badge backfill runs. Run the badge backfill immediately after deploy.
  • Search coverage: Lexical fallback scans 500 recent skills instead of 1200, so very old skills with no vector embedding are slightly less likely to appear in search results.

Migration steps

1. Deploy

Deploy the code normally. New embeddings and badge changes will automatically use the new paths. The hydration fallback keeps search working during the backfills; badges set before the deploy may be stale until the badge backfill completes.

2. Run backfills

# Backfill embeddingSkillMap lookup table
npx convex run maintenance:backfillEmbeddingSkillMapInternal --prod

# Backfill denormalized badges (skillBadges table → skill.badges)
npx convex run maintenance:backfillDenormalizedBadgesInternal --prod

Both are self-scheduling paginated mutations that process in batches.
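
As a rough illustration of what "self-scheduling paginated" means here, the sketch below processes one batch and then enqueues a follow-up job for the next cursor position until the table is exhausted. It is an assumption-laden model, not the PR's mutations: the source array, the target Map, and the scheduled queue are hypothetical stand-ins for a table, the lookup table, and a scheduler.

```typescript
type Row = { id: number; skillId: string };

// Hypothetical source table, target lookup table, and a queue standing in for a scheduler.
const source: Row[] = Array.from({ length: 7 }, (_, i) => ({ id: i, skillId: "skill-" + i }));
const target = new Map<number, string>();
const scheduled: Array<() => void> = [];

// Process one batch, then schedule the next batch if rows may remain.
function backfillBatch(cursor: number, batchSize: number): void {
  const page = source.slice(cursor, cursor + batchSize);
  for (const row of page) {
    // Idempotent write: re-running a batch does not create duplicates.
    target.set(row.id, row.skillId);
  }
  const next = cursor + page.length;
  if (page.length === batchSize && next < source.length) {
    scheduled.push(() => backfillBatch(next, batchSize));
  }
}

// Drain the queue the way a scheduler would run each follow-up job.
// Returns the number of batches that ran.
function runBackfill(batchSize: number): number {
  let runs = 0;
  scheduled.push(() => backfillBatch(0, batchSize));
  while (scheduled.length > 0) {
    const job = scheduled.shift()!;
    job();
    runs += 1;
  }
  return runs;
}
```

Keeping each batch small bounds the work done in any single run, and the idempotent write makes it safe to re-run the backfill if it is interrupted.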

3. Verify

After backfills complete:

  • Search should return correct results (lookup table path)
  • Badges should display correctly on all skills
  • Dashboard bandwidth should drop significantly

Risk

Low. All changes have graceful fallbacks:

  • If embeddingSkillMap row doesn't exist → falls back to reading the full embedding doc
  • Badge backfill syncs table → doc field; mutations keep them in sync going forward
  • removeSkillBadge uses destructuring to properly remove keys (no stale undefined values)
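
The embeddingSkillMap fallback in the first bullet can be sketched as follows. This is a minimal in-memory model under stated assumptions, not the PR's hydrateResults: the two Maps stand in for the skillEmbeddings and embeddingSkillMap tables, and the stats counters are hypothetical additions that make the two paths observable.

```typescript
type EmbeddingDoc = { skillId: string; vector: number[] }; // large doc, vector included
type MapRow = { skillId: string };                          // small lookup row

// Hypothetical in-memory stand-ins for the two tables.
const skillEmbeddings = new Map<string, EmbeddingDoc>();
const embeddingSkillMap = new Map<string, MapRow>();
const stats = { mapHits: 0, fullDocReads: 0 };

// Resolve embeddingId to skillId, preferring the small lookup row.
function resolveSkillId(embeddingId: string): string | null {
  const row = embeddingSkillMap.get(embeddingId);
  if (row) {
    stats.mapHits += 1;
    return row.skillId;
  }
  // Fallback for entries not yet backfilled: read the full embedding doc.
  const doc = skillEmbeddings.get(embeddingId);
  if (!doc) return null;
  stats.fullDocReads += 1;
  return doc.skillId;
}
```

Until the backfill finishes, some lookups take the expensive path; once every embedding has a map row, fullDocReads should stop growing.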

Test plan

  • All 279 convex tests pass
  • TypeScript compiles cleanly (no new errors)
  • Tested search on local dev with no-fallback hydrateResults — works correctly
  • Run both backfills on prod after deploy
  • Monitor prod bandwidth dashboard

🤖 Generated with Claude Code

…encies

The 5-minute stat event processor was patching skill documents on every run,
which invalidated listPublicPageV2 reactive queries for ALL subscribers —
causing a thundering herd responsible for ~17 TB (59%) of the 28.65 TB
monthly db bandwidth.

Split into two paths:
- Daily stats (15-min cron): writes to skillDailyStats only, no skill doc patches
- Skill doc sync (6-hour cron): patches skill documents with accumulated deltas

Also skip reading version docs in listPublicPageV2 and search hydration
(version data is only needed on detail pages, not listings).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
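
The two-path split described in this commit message can be modelled with a small sketch. It is illustrative only: the Maps stand in for the skillDailyStats table and the skill documents, the pendingDeltas buffer and the downloads stat name are assumptions, and the cron intervals are not represented.

```typescript
type StatEvent = { skillId: string; day: string; downloads: number }; // assumed event shape

const skillDailyStats = new Map<string, number>();          // key: skillId + ":" + day
const pendingDeltas = new Map<string, number>();            // key: skillId
const skillDocs = new Map<string, { downloads: number }>(); // stand-in for skill documents
let skillDocPatches = 0;                                    // counts writes to skill documents

// Frequent path: record events without touching skill docs,
// so listing subscriptions are not invalidated.
function processStatEvents(events: StatEvent[]): void {
  for (const e of events) {
    const key = e.skillId + ":" + e.day;
    skillDailyStats.set(key, (skillDailyStats.get(key) ?? 0) + e.downloads);
    pendingDeltas.set(e.skillId, (pendingDeltas.get(e.skillId) ?? 0) + e.downloads);
  }
}

// Infrequent path: one patch per skill with the accumulated delta.
function syncSkillDocs(): void {
  pendingDeltas.forEach((delta, skillId) => {
    const doc = skillDocs.get(skillId) ?? { downloads: 0 };
    skillDocs.set(skillId, { downloads: doc.downloads + delta });
    skillDocPatches += 1;
  });
  pendingDeltas.clear();
}
```

However many times the frequent path runs between syncs, each skill document is patched at most once per sync, which is what limits reactive query invalidation.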

vercel bot commented Feb 19, 2026

@sethconvex is attempting to deploy a commit to the Amantus Machina Team on Vercel.

A member of the Team first needs to authorize it.

Skills should never have more than a handful of badge records.
Using .take(10) instead of .collect() avoids unbounded reads.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
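
The difference between a bounded and an unbounded read can be shown with a tiny model. The readBadges helper below is hypothetical and only counts rows touched; it does not reproduce Convex's .take() or .collect(), it mirrors their effect on read volume when a skill unexpectedly has many badge rows.

```typescript
type BadgeRow = { skillId: string; kind: string };

// Hypothetical reader over rows already narrowed to one skill (as an index would do).
// A finite limit models .take(n); Infinity models .collect().
function readBadges(rows: BadgeRow[], limit: number): { result: BadgeRow[]; rowsRead: number } {
  const result: BadgeRow[] = [];
  let rowsRead = 0;
  for (const row of rows) {
    if (result.length >= limit) break;
    rowsRead += 1;
    result.push(row);
  }
  return { result, rowsRead };
}
```

With a handful of rows both variants read the same amount; the cap only matters in the pathological case, where it turns an unbounded read into a fixed worst case.
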
sethconvex changed the title from "perf: reduce db bandwidth ~60% by splitting stat processing" to "perf: cap badge query to reduce db reads" on Feb 19, 2026
- Remove badge table queries from listing and search paths (~200 queries
  per page load eliminated). Use denormalized skill.badges field instead.
- Sync skill.badges when badges are mutated (upsertSkillBadge/removeSkillBadge).
- Add embeddingSkillMap lookup table (~100 bytes/doc) so search hydration
  can skip reading full skillEmbeddings docs (~12KB each with vector).
- Remove dead badge query exports from search module.
- Reduce lexical fallback scan limit from 1200 to 500.
- Add backfill mutation for embeddingSkillMap with graceful fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
sethconvex changed the title from "perf: cap badge query to reduce db reads" to "perf: reduce db bandwidth — eliminate badge reads + embedding lookup table" on Feb 19, 2026
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
sethconvex marked this pull request as ready for review on February 19, 2026 22:58
greptile-apps bot left a comment

10 files reviewed, 2 comments

convex/skills.ts Outdated
const skill = await ctx.db.get(skillId)
if (skill) {
  const updatedBadges = { ...(skill.badges ?? {}) }
  updatedBadges[kind] = undefined

Setting a property to undefined doesn't remove it from the object. Use the delete operator instead to properly remove the badge field.

Suggested change
- updatedBadges[kind] = undefined
+ delete updatedBadges[kind]
Path: convex/skills.ts, line 672

sethconvex (Contributor, Author) replied:

Fixed in bfbccbb. Used destructuring to properly remove the key: const { [kind]: _, ...remainingBadges } = ...
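
For readers unfamiliar with the distinction raised in this thread, here is a minimal sketch of the two behaviours. The Badges type and both function names are hypothetical; only the destructuring pattern itself comes from the reply above.

```typescript
type Badges = Record<string, { at: number } | undefined>; // assumed shape, for illustration

// Assigning undefined leaves the key on the object with an undefined value.
function removeByAssignment(badges: Badges, kind: string): Badges {
  const updated = { ...badges };
  updated[kind] = undefined;
  return updated;
}

// Destructuring with a computed key copies every property except the removed one.
function removeByDestructuring(badges: Badges, kind: string): Badges {
  const { [kind]: _removed, ...remaining } = badges;
  return remaining;
}
```

The destructuring form also avoids mutating the input, whereas delete would need to operate on a copy to get the same guarantee.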

// Backfill embeddingSkillMap from existing skillEmbeddings.
// Run once after deploying the schema change:
// npx convex run maintenance:backfillEmbeddingSkillMapInternal --prod
export const backfillEmbeddingSkillMapInternal = internalMutation({

Consider adding a backfill to populate skill.badges from existing skillBadges table records. Currently only new badge changes sync to skill docs, leaving existing skills without denormalized badges until their badges are modified.

Path: convex/maintenance.ts, line 1417

sethconvex (Contributor, Author) replied:

Added backfillDenormalizedBadgesInternal in a22521f. It reads from the skillBadges table and syncs to the skill.badges field. It is a self-scheduling paginated mutation and is included in the migration steps.

sethconvex and others added 2 commits February 19, 2026 15:53
- Add backfillDenormalizedBadgesInternal: syncs skillBadges table →
  skill.badges field so listing/search reads are correct
- Simplify hydrateResults fallback

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…Badge

Avoids leaving an explicit undefined key in the badges object which
could fail Convex validation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>