Context
Companion issue to #2147 (CLAUDE.md). CLAUDE.md documents reviewer-enforced conventions so contributors (human and AI-assisted) can self-correct before opening a PR. This issue proposes programmatic enforcement so contributors don't have to read a doc to get the PR right — CI blocks the PR or the bot comments the fix directly.
CircleCI today runs npm run lint, mocha test, npm run xed-validation, and npm run incompatibility-check. It does not run npm run validate, and most of the recurring reviewer feedback patterns below are not enforced anywhere.
Proposed checks
Ordered roughly by value-to-effort ratio. Each item is independently shippable.
1. Run npm run validate in CI
bin/validate-schemas.js catches example files that reference fields not defined in the schema, enum values not in the enum array, type mismatches, etc. Running it in CI (gated on "zero new failures vs master baseline" — the master baseline has ~58 pre-existing unrelated failures) would catch the single most common PR regression.
Effort: add - run: npm run validate to .circleci/config.yml. The script already exits non-zero on failure. Baseline-diff logic can be added as a follow-up.
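The "zero new failures vs master baseline" gate can be sketched as a set difference over failure identifiers. This is a minimal sketch, not the existing script: the helper name newFailures and the "file: message" failure-key format are assumptions, and bin/validate-schemas.js may report failures in a different shape.

```javascript
// Hypothetical helper: compares validation failures on the PR branch against
// a stored master baseline. The "file: message" key format is an assumption,
// not something bin/validate-schemas.js is known to emit.
function newFailures(baseline, current) {
  const known = new Set(baseline);
  // Keep only failures that the baseline does not already contain.
  return current.filter((failure) => !known.has(failure));
}

// Illustrative data only.
const baselineFailures = ["a.example.1.json: unknown field xdm:foo"];
const branchFailures = [
  "a.example.1.json: unknown field xdm:foo",
  "b.example.1.json: value not in enum",
];

const added = newFailures(baselineFailures, branchFailures);
console.log(added);
// A CI wrapper would exit non-zero only when `added` is non-empty.
```

Keying on a stable string per failure keeps the ~58 pre-existing failures from blocking unrelated PRs while still surfacing anything new.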
2. meta:enum coverage check
Lint rule: for every schema under components/ / schemas/ / extensions/, every value listed in an enum array must appear as a key in the sibling meta:enum object. Missing keys fail CI with a file-and-path pointer.
Covers the #1 reviewer feedback ("forgot the meta:enum key").
Effort: small standalone Node script, callable from npm scripts. ~50 LOC with AJV's schema walker.
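A rough sketch of the coverage rule, written as a plain recursive walk rather than against any AJV API. The function name findMissingMetaEnum and the sample xdm:status property are hypothetical, and the walk visits every nested object, which is a simplification that a real lint rule would likely tighten.

```javascript
// Returns one message per enum value that has no matching meta:enum key.
// Paths are JSON-pointer-like strings for the file-and-path pointer.
function findMissingMetaEnum(node, path = "#") {
  const problems = [];
  if (Array.isArray(node)) {
    node.forEach((item, i) => {
      problems.push(...findMissingMetaEnum(item, `${path}/${i}`));
    });
    return problems;
  }
  if (node === null || typeof node !== "object") return problems;

  if (Array.isArray(node.enum)) {
    const meta = node["meta:enum"] || {};
    for (const value of node.enum) {
      if (!Object.prototype.hasOwnProperty.call(meta, String(value))) {
        problems.push(`${path}: enum value "${value}" has no meta:enum key`);
      }
    }
  }
  for (const [key, child] of Object.entries(node)) {
    // meta:enum holds labels and enum holds values; neither is a subschema.
    if (key === "meta:enum" || (key === "enum" && Array.isArray(child))) continue;
    problems.push(...findMissingMetaEnum(child, `${path}/${key}`));
  }
  return problems;
}

// Hypothetical schema fragment for illustration.
const sampleSchema = {
  properties: {
    "xdm:status": {
      type: "string",
      enum: ["active", "inactive"],
      "meta:enum": { active: "Active" },
    },
  },
};
console.log(findMissingMetaEnum(sampleSchema));
```

A CI script could run this over each schema file and fail when the returned list is non-empty.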
3. Example-schema field-drift check
Stricter version of npm test: for each *.example.*.json, assert every xdm:-prefixed property path resolves to a definition somewhere in the merged schema (including inherited allOf chains). Today this is caught implicitly when the schema uses additionalProperties: false; many XDM schemas don't set that, so invented fields in examples pass mocha.
Effort: medium. Reuse bin/validate-schemas.js plumbing; add strict-mode wrapper.
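The strict-mode idea could look roughly like the sketch below. It assumes schemas arrive with $ref already resolved, follows only properties and allOf, and skips arrays; the function names and sample fields are hypothetical, and none of this reflects how bin/validate-schemas.js is actually structured.

```javascript
// Collects the property subschemas visible at one schema node, following
// allOf branches. $ref resolution is deliberately left out of this sketch,
// and a property defined in several branches is overwritten, not merged.
function collectProperties(schema) {
  const props = { ...(schema.properties || {}) };
  for (const branch of schema.allOf || []) {
    Object.assign(props, collectProperties(branch));
  }
  return props;
}

// Returns dotted paths of xdm:-prefixed keys in `example` that have no
// definition in `schema`. Arrays are not descended into.
function findUndefinedFields(example, schema, path = "") {
  const drift = [];
  if (example === null || typeof example !== "object" || Array.isArray(example)) {
    return drift;
  }
  const props = collectProperties(schema);
  for (const [key, value] of Object.entries(example)) {
    const here = path ? `${path}.${key}` : key;
    if (!Object.prototype.hasOwnProperty.call(props, key)) {
      if (key.startsWith("xdm:")) drift.push(here);
      continue;
    }
    drift.push(...findUndefinedFields(value, props[key], here));
  }
  return drift;
}

// Hypothetical schema and example for illustration.
const personSchema = {
  allOf: [{ properties: { "xdm:name": { type: "string" } } }],
  properties: {
    "xdm:address": { properties: { "xdm:city": { type: "string" } } },
  },
};
const personExample = {
  "xdm:name": "A",
  "xdm:address": { "xdm:city": "B", "xdm:zip": "C" },
};
console.log(findUndefinedFields(personExample, personSchema));
```

Unlike additionalProperties: false, this reports invented fields without changing what the schemas themselves accept.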
4. meta:status presence check
Every new schema (added in the PR diff) must set meta:status. Today this is reviewer-enforced; a git diff --name-only --diff-filter=A upstream/master...HEAD | grep '\.schema\.json$' | xargs jq ... one-liner in CI would flag omissions.
Effort: small. Pure shell + jq in CI.
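If the check ends up living next to the other Node scripts instead of in shell, the core of it might look like this. The function name, the file paths, and the "experimental" value are illustrative assumptions; the input is assumed to be a map of added file paths to their parsed JSON.

```javascript
// `addedSchemas` maps file path -> parsed JSON for files added in the PR.
// Returns the schema files that do not set meta:status to a string.
function missingMetaStatus(addedSchemas) {
  return Object.entries(addedSchemas)
    .filter(([file]) => file.endsWith(".schema.json"))
    .filter(([, schema]) => typeof schema["meta:status"] !== "string")
    .map(([file]) => file);
}

// Hypothetical PR contents for illustration.
const addedInPr = {
  "components/datatypes/a.schema.json": { "meta:status": "experimental" },
  "components/datatypes/b.schema.json": { title: "B" },
  "components/datatypes/b.example.1.json": { "xdm:foo": 1 },
};
console.log(missingMetaStatus(addedInPr));
```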
5. PR body / metadata checks (GitHub Action)
An Action that, on PR open / synchronize:
- Requires Closes #<n> or Fixes #<n> in the body
- Requires the needs review label before a reviewer is auto-requested
- Detects rename/remove of existing properties in a schema diff and posts a "consider deprecating instead" comment with a link to the deprecation pattern in CLAUDE.md
- Detects conversion of a plain string property to an enum-constrained property and posts a "consider using meta:enum on a soft enum instead" comment
Effort: medium, one reusable workflow.
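The first bullet above, the issue-link requirement, reduces to a small pattern match on the PR body. A minimal sketch, assuming only the two keywords named in this issue are accepted; the helper name linkedIssue is hypothetical and the wiring into an Action is left out.

```javascript
// Matches "Closes #123" or "Fixes #123" anywhere in the PR body,
// case-insensitively. Only the two keywords named above are accepted.
const ISSUE_LINK = /\b(?:closes|fixes)\s+#(\d+)\b/i;

// Returns the linked issue number, or null when the body has no link.
function linkedIssue(prBody) {
  const match = ISSUE_LINK.exec(prBody || "");
  return match ? Number(match[1]) : null;
}

console.log(linkedIssue("Adds meta:enum keys.\n\nFixes #2147"));
console.log(linkedIssue("Adds meta:enum keys."));
```

A workflow step could fail, or post a comment, whenever linkedIssue returns null.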
6. Auto-fix bot (stretch)
For the subset of issues that have a deterministic fix (missing meta:enum keys for values that exist in enum, xdm: prefix missing on a property that the schema defines with the prefix, prettier reformatting), a GitHub Action bot that opens a fix-commit PR branch or leaves suggestions.
Effort: large; only worth doing after 1–5 are in place and we have usage data on the most common violations.
7. Reuse-over-reinvent hint (stretch)
Harder to automate, but a suggest-existing-datatypes.js helper that takes a new datatype file and surfaces the top-N semantically similar existing datatypes (by title + property-name Jaccard similarity) would save design-review round-trips. Could run as a PR comment, not a blocking check.
Effort: medium–large. Likely worth doing only after 1–5.
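For a sense of scale, the property-name half of the similarity score might be sketched as follows. This covers property names only and ignores titles; the function names and sample datatypes are hypothetical, and suggest-existing-datatypes.js does not exist yet.

```javascript
// Jaccard similarity: |A ∩ B| / |A ∪ B|, defined here as 0 for two empty sets.
function jaccard(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  if (setA.size === 0 && setB.size === 0) return 0;
  let shared = 0;
  for (const item of setA) {
    if (setB.has(item)) shared += 1;
  }
  return shared / (setA.size + setB.size - shared);
}

// Ranks existing datatypes by property-name overlap with the candidate.
function topSimilar(candidate, existing, n = 3) {
  const names = (schema) => Object.keys(schema.properties || {});
  return existing
    .map((entry) => ({
      title: entry.title,
      score: jaccard(names(candidate), names(entry)),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, n);
}

// Hypothetical datatypes for illustration.
const candidateType = { title: "New Address", properties: { "xdm:city": {}, "xdm:zip": {} } };
const existingTypes = [
  { title: "Address", properties: { "xdm:city": {}, "xdm:zip": {}, "xdm:street": {} } },
  { title: "Person", properties: { "xdm:name": {} } },
];
console.log(topSimilar(candidateType, existingTypes, 1));
```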
Rollout
- Ship 1–4 behind a CI job that runs but does not block, for a 2-week observation window. Count true-positives vs false-positives.
- Promote to blocking once the false-positive rate is known and the baseline-diff UX is sound.
- 5–7 follow later as separate PRs.
Non-goals
- Changing any existing contributor-facing workflows beyond adding checks.
- Replacing the XDM team's human review — these checks catch mechanical mistakes so reviewers can spend their time on design.
Happy to take any or all of these on as follow-up PRs once there's signal on which ones are most valuable to the team.