Add public discovery engine foundation #98

Open
BASIC-BIT wants to merge 3 commits into main from feat/public-discovery-engine

Conversation

@BASIC-BIT
Owner

Summary

  • Adds the public discovery/search foundation across profiles, worlds, and events with universal search documents, vocabulary terms, featured placements, and suppression/opt-out state.
  • Adds /discover, /search, and /privacy/suppression web surfaces plus a search-first home hero and optional PostHog analytics hooks.
  • Documents search, vocabulary, surfacing, suppression, and analytics contracts with backend tests and Playwright fixture coverage.

Testing

  • pnpm verify
  • pnpm test:e2e
  • pnpm test:e2e:visual

Risk Notes

  • PostHog is no-op unless NEXT_PUBLIC_POSTHOG_KEY is configured.
  • Vector search provider remains behind the provider-neutral searchEmbeddings seam.
  • Suppression intake is request-only until full auth/claim review workflows exist.
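
The PostHog no-op behavior in the first risk note could look roughly like the sketch below. This is a minimal illustration, not the PR's `PostHogProvider.tsx`: `createAnalyticsClient`, `AnalyticsEvent`, the `send` callback, and the `search_submitted` event name are all hypothetical, and `send` stands in for the real PostHog call so the example stays self-contained.

```typescript
// Hypothetical analytics seam; names are illustrative, not the PR's actual API.
type AnalyticsEvent = { name: string; properties?: Record<string, unknown> };

interface AnalyticsClient {
  readonly enabled: boolean;
  capture(event: AnalyticsEvent): void;
}

// `send` stands in for the real PostHog client call.
function createAnalyticsClient(
  apiKey: string | undefined,
  send: (event: AnalyticsEvent) => void,
): AnalyticsClient {
  if (!apiKey) {
    // No key configured: every capture call is silently dropped.
    return { enabled: false, capture: () => {} };
  }
  return { enabled: true, capture: (event) => send(event) };
}

// Usage: record what would have been sent in each configuration.
const sent: AnalyticsEvent[] = [];
const record = (event: AnalyticsEvent): void => {
  sent.push(event);
};

// In the app the key would come from NEXT_PUBLIC_POSTHOG_KEY.
const disabledClient = createAnalyticsClient(undefined, record);
disabledClient.capture({ name: "search_submitted" }); // dropped

const enabledClient = createAnalyticsClient("phc_example_key", record);
enabledClient.capture({ name: "search_submitted" }); // forwarded to `send`
```

Returning a client object in both cases means call sites never need to check whether analytics is configured.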

Closes #25
Closes #26
Closes #28
Closes #29
Closes #30
Closes #31
Closes #32
Closes #33
Closes #90
Closes #95
Closes #96
Closes #97

@vercel

vercel Bot commented May 27, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| vr-dex-web | Ready | Preview, Comment | May 27, 2026 12:53pm |

@github-actions

github-actions Bot commented May 27, 2026

Playwright Public Screenshot Preview

Outcome: success
Run: https://github.com/BASIC-BIT/VRDex/actions/runs/26512181756
Artifact: playwright-public-preview

Captured routes:

  • /
  • /submit
  • /server-status
  • /deployment
  • /p/playwright-dj-aurora
  • /c/playwright-afterglow-social

This job is required for PR checks; pixel diff baselines are not enabled yet.

@BASIC-BIT BASIC-BIT marked this pull request as ready for review May 27, 2026 12:22

@greptile-apps

greptile-apps Bot commented May 27, 2026

Greptile Summary

This PR adds the public discovery engine foundation: universal search documents for profiles, worlds, and events; a vocabulary/facets system; featured placements; suppression/opt-out intake; and the /discover, /search, and /privacy/suppression web surfaces with optional PostHog analytics hooks.

  • Search document layer (convex/_searchDocuments.ts, convex/schema.ts): new searchDocuments and related tables, with indexing wired into profile/event/world mutations; listDiscovery and searchUniversal queries serve the front-end.
  • Vocabulary system (convex/_vocabulary.ts, convex/search.ts): seeded and user-created vocabulary terms with upsert logic and a seedVocabulary mutation that currently lacks an auth guard.
  • Suppression intake (convex/suppressions.ts, suppression form): request-only workflow (no automatic enforcement), auth identity captured when present but not required by design.
  • Frontend surfaces (discovery-public-page.tsx, PostHogProvider.tsx, discovery-analytics.tsx): discovery/search pages with PostHog no-op analytics and Playwright fixture support for visual review.

Confidence Score: 3/5

Merging now would ship a discovery page that silently stops showing upcoming events once the platform accumulates more than ~40 past events, and would expose an unguarded admin mutation.

The ascending index order in listDocumentsByType is the most impactful issue: as soon as 40+ past events exist in production, listDiscovery fetches no upcoming events and the featured section fills entirely with stale past events. The seedVocabulary mutation being callable without authentication allows data corruption of vocabulary usage counts. Both issues are in core data-access paths of the feature being shipped.

convex/search.ts needs the index ordering fix and the auth guard on seedVocabulary; convex/_searchDocuments.ts has a dead import and an incomplete trustLabel implementation worth resolving before the field is expected by the UI.

Security Review

  • Unauthenticated admin mutation (convex/search.ts seedVocabulary): the mutation is callable by any unauthenticated client. Repeated calls increment usageCount for all seeded vocabulary terms, polluting usage data that is surfaced to end users via the discovery terms list.

Important Files Changed

| Filename | Overview |
| --- | --- |
| convex/search.ts | New search and discovery queries. listDocumentsByType uses ascending index order, which will silently exclude upcoming events once past-event count exceeds 40. seedVocabulary mutation lacks any authentication check. |
| convex/_searchDocuments.ts | Core search document builder. getProfileTrustLabel is imported but never called; trustLabel is declared in PublicSearchResult but toPublicSearchResult never sets it. Stale featuredRank on past events inflates scores in universal search. |
| convex/suppressions.ts | New suppression intake mutation. Auth identity is optional by design (unauthenticated requests allowed), with slug/display-name validation guarding the DB write. Logic looks correct for the stated request-only model. |
| convex/schema.ts | Adds profileSuppressionRequests, vocabularyTerms, searchDocuments, searchEmbeddings, and featuredPlacements tables with correct indexes. Schema looks well-structured. |
| convex/_vocabulary.ts | Vocabulary term management with upsert logic. Seeded term labels are preserved on update. Logic is correct, though usageCount grows unbounded with every index write. |
| convex/profiles.ts | Adds search document and vocabulary upsert calls after community profile submission. Correct auth guard remains. |
| convex/events.ts | Adds search document and vocabulary upsert calls after event create/update. Correctly passes community, world, and role context to the search document builder. |
| apps/web/src/app/PostHogProvider.tsx | PostHog analytics provider. No-op without NEXT_PUBLIC_POSTHOG_KEY; correctly opts out of capturing in non-production environments. |
| apps/web/src/convex/playwright-fixtures.ts | Deterministic fixture data for Playwright visual review. Correctly gated by NODE_ENV !== production and VRDEX_ENABLE_PLAYWRIGHT_FIXTURES env var. |
| apps/web/src/app/privacy/suppression/suppression-request-form.tsx | Client-side suppression request form. Form casts for requestType and profileType rely on fixed select options; server-side Convex schema validates the union types. No issues. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph Mutations
        CP[submitCommunityProfile] --> USD1[upsertSearchDocument]
        CE[createCommunityEvent] --> USD2[upsertSearchDocument]
        UE[updateCommunityEvent] --> USD3[upsertSearchDocument]
        SV[seedVocabulary no auth check] --> RVT[recordVocabularyTerms]
    end

    subgraph SearchDocuments
        USD1 & USD2 & USD3 --> SD[(searchDocuments publicState / entityType / featuredRank)]
    end

    subgraph Queries
        SD -->|by_publicState_entityType_featuredRank ascending order| LDT[listDocumentsByType x 40]
        LDT --> SRR[sortSearchResults]
        SRR --> LD[listDiscovery featured / upcomingEvents / people / worlds]

        SD -->|search_text index| SU[searchUniversal]
        SU --> TPSR[toPublicSearchResult freshnessBoost at query time featuredRank stale for past events]
    end

    subgraph Frontend
        LD --> DPP[/discover page/]
        TPSR --> SP[/search page/]
    end
Prompt To Fix All With AI
Fix the following 4 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 4
convex/search.ts:55-62
**Ascending index order drops upcoming events once past-event count exceeds 40**

`listDocumentsByType` queries `by_publicState_entityType_featuredRank` without `.order("desc")`, so Convex returns the 40 documents with the **lowest** `featuredRank` first. Past events have `featuredRank: 8` and upcoming events have `featuredRank: 42`. Once more than 40 past events exist, every `.take(40)` call fills entirely with past events — `listDiscovery`'s `upcomingEvents` and `featured` sections will silently return nothing from new upcoming events because those documents are never fetched. The downstream `sortSearchResults` re-rank cannot help if the candidates were never retrieved.

```suggestion
async function listDocumentsByType(ctx: QueryCtx, entityType: SearchEntityType) {
  return await ctx.db
    .query("searchDocuments")
    .withIndex("by_publicState_entityType_featuredRank", (index) =>
      index.eq("publicState", "public").eq("entityType", entityType),
    )
    .order("desc")
    .take(40);
}
```
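
To see why the ordering matters, the following self-contained simulation may help. It is a sketch only: plain arrays stand in for the Convex index, `takeByRank` and `countUpcoming` are hypothetical helpers, and the counts (45 past, 5 upcoming) are made up for illustration. The rank values 8 and 42 come from the issue text.

```typescript
// Plain arrays stand in for the Convex index; this is not Convex code.
type Doc = { id: string; featuredRank: number };

// 45 past events (rank 8) and 5 upcoming events (rank 42).
const docs: Doc[] = [
  ...Array.from({ length: 45 }, (_, i) => ({ id: `past-${i}`, featuredRank: 8 })),
  ...Array.from({ length: 5 }, (_, i) => ({ id: `upcoming-${i}`, featuredRank: 42 })),
];

// Mimics reading an index ordered by featuredRank, then taking the first `limit` rows.
function takeByRank(all: Doc[], order: "asc" | "desc", limit: number): Doc[] {
  const sorted = [...all].sort((a, b) =>
    order === "asc" ? a.featuredRank - b.featuredRank : b.featuredRank - a.featuredRank,
  );
  return sorted.slice(0, limit);
}

const countUpcoming = (page: Doc[]): number =>
  page.filter((doc) => doc.featuredRank === 42).length;

// Ascending order fills all 40 slots with past events.
const ascendingUpcoming = countUpcoming(takeByRank(docs, "asc", 40)); // 0
// Descending order retrieves the 5 upcoming events first.
const descendingUpcoming = countUpcoming(takeByRank(docs, "desc", 40)); // 5
```

Any re-ranking applied after the fetch only sees the 40 retrieved rows, which is why the ascending case cannot be repaired downstream.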

### Issue 2 of 4
convex/search.ts:106-124
**`seedVocabulary` mutation is callable by anyone, including unauthenticated users**

There is no `ctx.auth.getUserIdentity()` check in the handler. In Convex, any public mutation is callable directly by any client. Repeated calls to `seedVocabulary` drive up `usageCount` for all 22 seeded vocabulary terms on every invocation (via `recordVocabularyTerms`), and `usageCount` is exposed to the front-end as a signal in discovery term lists. A bot looping this call would corrupt the vocabulary usage data before any admin has a chance to review submissions.

```suggestion
export const seedVocabulary = mutation({
  args: {},
  handler: async (ctx) => {
    const identity = await ctx.auth.getUserIdentity();

    if (identity === null) {
      throw new Error("Seeding vocabulary requires a signed-in user.");
    }

    const now = Date.now();

    await recordVocabularyTerms(
```
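
The effect of the guard can be sketched in isolation as below. This is an illustration under stated assumptions, not the PR's handler: `Identity`, `assertSignedIn`, `seedVocabularySim`, and the two sample terms are hypothetical, and the identity is passed in directly instead of being awaited from `ctx.auth.getUserIdentity()`.

```typescript
// Hypothetical stand-in for the identity returned by the auth layer.
type Identity = { subject: string };

// Rejects anonymous callers before any write happens.
function assertSignedIn(identity: Identity | null): Identity {
  if (identity === null) {
    throw new Error("Seeding vocabulary requires a signed-in user.");
  }
  return identity;
}

// Simulated usage counters for two made-up seeded terms.
const usageCounts = new Map<string, number>([
  ["dj", 0],
  ["club", 0],
]);

// Simulated seed: increments every counter, but only after the guard passes.
function seedVocabularySim(identity: Identity | null): void {
  assertSignedIn(identity);
  usageCounts.forEach((count, term) => {
    usageCounts.set(term, count + 1);
  });
}

// An anonymous call is rejected and leaves the counters untouched.
let anonymousRejected = false;
try {
  seedVocabularySim(null);
} catch (error) {
  anonymousRejected = error instanceof Error;
}

// A signed-in call goes through and bumps each counter once.
seedVocabularySim({ subject: "user-1" });
```

A signed-in check alone still lets any authenticated user run the seed. Whether an additional role check is needed depends on the project's auth model, which this review does not describe.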

### Issue 3 of 4
convex/_searchDocuments.ts:285-321
**`trustLabel` declared in `PublicSearchResult` but never populated; `getProfileTrustLabel` import is dead code**

`PublicSearchResult` (line 23) declares `trustLabel?:` and `getProfileTrustLabel` is imported at line 5, but `toPublicSearchResult` never reads `document.claimState` or calls `getProfileTrustLabel` — the field is always `undefined` in every API response. Any front-end card logic that branches on `trustLabel` will silently fall through to the "unclaimed" or empty-state path for all search hits, even verified profiles.
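
One possible shape for the fix is sketched below. Everything here is an assumption made for illustration: the `ClaimState` values, the label strings, the `SearchDocumentLike` type, and the signature of `getProfileTrustLabel` are hypothetical and may not match the real definitions in `convex/_searchDocuments.ts`.

```typescript
// Hypothetical claim states; the real union may differ.
type ClaimState = "unclaimed" | "claimed" | "verified";

interface SearchDocumentLike {
  title: string;
  claimState?: ClaimState;
}

interface PublicSearchResult {
  title: string;
  trustLabel?: string;
}

// Assumed label mapping; stands in for the imported getProfileTrustLabel.
const TRUST_LABELS: Record<ClaimState, string> = {
  unclaimed: "Unclaimed",
  claimed: "Claimed",
  verified: "Verified",
};

function getProfileTrustLabel(claimState: ClaimState): string {
  return TRUST_LABELS[claimState];
}

function toPublicSearchResult(document: SearchDocumentLike): PublicSearchResult {
  const result: PublicSearchResult = { title: document.title };
  // Only set the label when the document actually carries a claim state.
  if (document.claimState !== undefined) {
    result.trustLabel = getProfileTrustLabel(document.claimState);
  }
  return result;
}

const verifiedHit = toPublicSearchResult({ title: "DJ Aurora", claimState: "verified" });
const worldHit = toPublicSearchResult({ title: "Afterglow Social" });
```

Leaving `trustLabel` unset for documents without a claim state keeps the field optional, so existing consumers that ignore it are unaffected.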

### Issue 4 of 4
convex/_searchDocuments.ts:239-283
**Stale `featuredRank` on past events inflates their scores in `searchUniversal`**

`featuredRank` is set at index time (`isUpcoming ? 42 : 8`), so upcoming events are stored with `42`. Re-indexing happens only on an explicit event update, so an event that passes without being updated keeps `featuredRank: 42`. In `toPublicSearchResult`, `freshnessBoost` is correctly computed at query time (it returns 0 for past events), but `document.featuredRank` is read from the stale stored value. A stale past event therefore scores ≈ 90 (`trustRank: 30 + featuredRank: 42 + typeWeight: 18`), outranking a freshly indexed world (≈ 64). This particularly affects `searchUniversal`, where all public documents are candidates.
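
The arithmetic can be checked with a short sketch that derives the event rank at query time instead of trusting the stored value. The weights (30, 42, 8, 18) are taken from the issue text. The `EventDoc` shape, the `endsAt` field, and both scoring functions are hypothetical, and deriving the rank at query time is one possible fix, not necessarily the one the author will choose.

```typescript
// Rank values from the issue text.
const UPCOMING_RANK = 42;
const PAST_RANK = 8;

// Hypothetical document shape; field names are assumptions.
type EventDoc = {
  trustRank: number;
  typeWeight: number;
  storedFeaturedRank: number;
  endsAt: number; // epoch milliseconds
};

// Current behavior as described: the stored rank is used as-is.
function scoreWithStoredRank(doc: EventDoc): number {
  return doc.trustRank + doc.storedFeaturedRank + doc.typeWeight;
}

// Possible fix: recompute the rank from the event's end time at query time.
function scoreWithQueryTimeRank(doc: EventDoc, now: number): number {
  const rank = doc.endsAt > now ? UPCOMING_RANK : PAST_RANK;
  return doc.trustRank + rank + doc.typeWeight;
}

const now = 1000000;

// An event that ended just before `now` but was indexed while still upcoming.
const staleEvent: EventDoc = {
  trustRank: 30,
  typeWeight: 18,
  storedFeaturedRank: 42,
  endsAt: now - 1,
};

const staleScore = scoreWithStoredRank(staleEvent); // 30 + 42 + 18 = 90
const correctedScore = scoreWithQueryTimeRank(staleEvent, now); // 30 + 8 + 18 = 56
```

With the corrected score of 56, the past event falls below the ≈ 64 quoted for a freshly indexed world.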

Reviews (1): Last reviewed commit: "Add public discovery engine foundation"
