feat: moderation v2 core backend engine and pipeline #333

Open
ArthurzKV wants to merge 3 commits into openclaw:main from ArthurzKV:codex/skill-verification-v2-clawhub-core

ArthurzKV commented Feb 15, 2026

Summary

Introduce the moderation v2 backend foundation in ClawHub: a normalized verdict/reason/evidence model, deterministic static scanning, publish-time moderation derivation, and backfill support.

Why

Trust decisions need to be consistent and explainable across static, VT, and LLM signals while preserving compatibility with existing moderation fields.
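To make the "consistent and explainable" goal concrete, here is a minimal TypeScript sketch of how one verdict could be derived from reason codes collected across static, VT, and LLM signals. It is an illustration only: the type names, the `deriveVerdict` helper, the prefix-based severity rule, and every reason code other than `suspicious.nonstandard_network` are assumptions and are not taken from convex/lib/moderationEngine.ts.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not the PR's API.
type Verdict = "clean" | "suspicious" | "malicious";
type Source = "static" | "vt" | "llm";

interface Evidence {
  source: Source; // which signal produced the finding
  reasonCode: string; // e.g. "suspicious.nonstandard_network"
  detail: string; // human-readable explanation
}

interface ModerationResult {
  verdict: Verdict;
  reasonCodes: string[]; // deduplicated and sorted for stable output
  evidence: Evidence[];
}

// Severity rank used to pick the strongest verdict across all signals.
const RANK: Record<Verdict, number> = { clean: 0, suspicious: 1, malicious: 2 };

// Assumption: the prefix before the first dot encodes the severity.
function verdictForCode(code: string): Verdict {
  if (code.startsWith("malicious.")) return "malicious";
  if (code.startsWith("suspicious.")) return "suspicious";
  return "clean";
}

function deriveVerdict(evidence: Evidence[]): ModerationResult {
  let verdict: Verdict = "clean";
  for (const e of evidence) {
    const v = verdictForCode(e.reasonCode);
    if (RANK[v] > RANK[verdict]) verdict = v;
  }
  const reasonCodes = Array.from(new Set(evidence.map((e) => e.reasonCode))).sort();
  return { verdict, reasonCodes, evidence };
}
```

Because the result depends only on the set of reason codes, the same inputs yield the same verdict regardless of which signal reported first, and the retained evidence list explains why the verdict was reached.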

Focused scope

This PR is scoped to one theme: core moderation v2 backend pipeline.

What changed

  • Added normalized moderation fields in convex/schema.ts.
  • Added canonical reason code + verdict utilities in convex/lib/moderationReasonCodes.ts.
  • Added moderation engine in convex/lib/moderationEngine.ts.
  • Integrated deterministic static scan in publish/backfill paths (convex/lib/skillPublish.ts, convex/skills.ts, convex/vt.ts).
  • Updated moderation/public safety logic (convex/lib/moderation.ts, convex/lib/public.ts, convex/lib/skillSafety.ts).
  • Follow-up fixes included:
    • escalateByVtInternal no longer overwrites moderationFlags after the merge logic
    • backfill cursor no longer skips a candidate at batch boundaries
    • child_process exec guard no longer produces false positives from fallback line text
    • network rule name aligned to suspicious.nonstandard_network
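For readers unfamiliar with deterministic static scanning, the following is a rough TypeScript sketch of a rule-based, line-oriented scanner with a file-level guard on exec-style calls. It is not the scanner from this PR: the `Rule` and `Finding` shapes, `scanFiles`, both regular expressions, and the `suspicious.dangerous_exec` code are hypothetical. Only the `suspicious.nonstandard_network` name comes from the PR.

```typescript
// Hypothetical sketch of a deterministic, rule-based static scan.
interface Finding {
  reasonCode: string;
  file: string;
  line: number; // 1-based line number of the match
  snippet: string;
}

interface Rule {
  reasonCode: string;
  pattern: RegExp; // tested against each line (no "g" flag, so test() is stateless)
  fileGuard?: (content: string) => boolean; // rule applies only when this returns true
}

const RULES: Rule[] = [
  {
    // Hypothetical reason code. Exec-style calls are flagged only in files
    // that reference child_process, to reduce false positives such as
    // RegExp.prototype.exec.
    reasonCode: "suspicious.dangerous_exec",
    pattern: /\b(exec|execSync|spawn)\s*\(/,
    fileGuard: (content) => /child_process/.test(content),
  },
  {
    // Reason code name from the PR; the pattern is an assumption:
    // a URL with an explicit port other than 80 or 443.
    reasonCode: "suspicious.nonstandard_network",
    pattern: /https?:\/\/[^\s/:"']+:(?!(?:80|443)\b)\d+/,
  },
];

function scanFiles(files: Record<string, string>): Finding[] {
  const findings: Finding[] = [];
  // Sort paths so the output order does not depend on object key order.
  for (const file of Object.keys(files).sort()) {
    const content = files[file] ?? "";
    const lines = content.split(/\r?\n/);
    for (const rule of RULES) {
      if (rule.fileGuard && !rule.fileGuard(content)) continue;
      lines.forEach((text, i) => {
        if (rule.pattern.test(text)) {
          findings.push({ reasonCode: rule.reasonCode, file, line: i + 1, snippet: text.trim() });
        }
      });
    }
  }
  return findings;
}
```

The scan uses no randomness, clock, or network access, so the same file contents always produce the same findings in the same order, which is the property "deterministic" refers to here.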

Local validation

  • bun run lint:oxlint
  • bunx vitest run convex/lib/moderationEngine.test.ts convex/skills.rateLimit.test.ts

AI assistance transparency

  • AI-assisted: Yes (implemented with Codex assistance)
  • Testing level: Targeted local validation on touched modules
  • I reviewed the final diffs and understand the behavior changes.

vercel bot commented Feb 15, 2026

@ArthurzKV is attempting to deploy a commit to the Amantus Machina Team on Vercel.

A member of the Team first needs to authorize it.

greptile-apps bot left a comment

10 files reviewed, 3 comments

ArthurzKV (Author) commented

Addressed review feedback in follow-up commits:

  • ac8fde6:
    • fixed escalateByVtInternal so moderationFlags are not overwritten after merge logic.
    • fixed backfill cursor advancement to avoid skipping a candidate at batch boundaries.
    • fixed child_process exec guard so fallback line text does not create false positives.
  • beba7a0: renamed network reason code to suspicious.nonstandard_network for naming consistency.
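The two bug classes fixed in ac8fde6 (flags overwritten after a merge, and a candidate skipped at a batch boundary) can be illustrated with a small TypeScript sketch. These helpers are hypothetical and are not the code from convex/vt.ts or convex/skills.ts. In particular, the in-memory numeric cursor stands in for whatever cursor the real backfill uses, and the flag values in the usage are made up.

```typescript
// Hypothetical helpers; not the PR's actual code.

// Merge new flags into existing ones instead of replacing them, so a later
// escalation step cannot silently drop flags that were set earlier.
function mergeFlags(existing: string[] | undefined, incoming: string[]): string[] {
  return Array.from(new Set((existing ?? []).concat(incoming))).sort();
}

interface Page<T> {
  items: T[];
  nextCursor: number | null; // index of the first unprocessed item, or null when done
}

// Take up to `batchSize` candidates starting at `cursor`. The next cursor is
// derived from the number of items actually returned, so the item at the
// batch boundary becomes the first item of the next page rather than being skipped.
function takeBatch<T>(all: T[], cursor: number, batchSize: number): Page<T> {
  const items = all.slice(cursor, cursor + batchSize);
  const next = cursor + items.length;
  return { items, nextCursor: next < all.length ? next : null };
}

// Usage: walk five candidates in batches of two and record every one visited.
const visited: string[] = [];
let position: number | null = 0;
while (position !== null) {
  const page: Page<string> = takeBatch(["s1", "s2", "s3", "s4", "s5"], position, 2);
  visited.push(...page.items);
  position = page.nextCursor;
}
```

A common way to introduce the skip is to advance the cursor past the last item processed and then treat the cursor as exclusive on the next read. Deriving the next position from the returned page, as above, avoids that off-by-one.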

Validation run: lint + moderation engine/rate-limit tests.
