fix: clear stale suspicious flag when VT verdict improves#418
fix: clear stale suspicious flag when VT verdict improves#418Phineas1500 wants to merge 2 commits intoopenclaw:mainfrom
Conversation
The daily VT rescan updated vtAnalysis on the version but only called escalateByVtInternal for suspicious/malicious verdicts. When a verdict improved from suspicious to clean, the version's vtAnalysis was updated (website shows "Benign") but the skill's moderationFlags kept the stale "flagged.suspicious" entry (CLI warns "suspicious"). Now the rescan calls approveSkillByHashInternal to clear the flag on de-escalation.
|
@Phineas1500 is attempting to deploy a commit to the Amantus Machina Team on Vercel. A member of the Team first needs to authorize it. |
convex/vt.ts
Outdated
| } else if (wasFlagged) { | ||
| // Verdict improved from suspicious → clean: clear the stale moderation flag | ||
| console.log(`[vt:rescan] ${slug}: verdict improved to clean, clearing suspicious flag`) | ||
| await ctx.runMutation(internal.skills.approveSkillByHashInternal, { | ||
| sha256hash, | ||
| scanner: 'vt', | ||
| status, | ||
| }) | ||
| accUpdated++ |
There was a problem hiding this comment.
Pending verdict also clears suspicious flag
The else if (wasFlagged) branch runs for any status that isn't malicious or suspicious, which includes 'pending' (returned by verdictToStatus for unrecognized verdict strings). If VT ever introduces a new verdict category not in BENIGN_VERDICTS/MALICIOUS_VERDICTS/SUSPICIOUS_VERDICTS, a previously-flagged skill would have its suspicious flag cleared even though the verdict is inconclusive.
Consider guarding this branch to only clear on a confirmed clean verdict:
| } else if (wasFlagged) { | |
| // Verdict improved from suspicious → clean: clear the stale moderation flag | |
| console.log(`[vt:rescan] ${slug}: verdict improved to clean, clearing suspicious flag`) | |
| await ctx.runMutation(internal.skills.approveSkillByHashInternal, { | |
| sha256hash, | |
| scanner: 'vt', | |
| status, | |
| }) | |
| accUpdated++ | |
| } else if (wasFlagged && status === 'clean') { | |
| // Verdict improved from suspicious → clean: clear the stale moderation flag | |
| console.log(`[vt:rescan] ${slug}: verdict improved to clean, clearing suspicious flag`) | |
| await ctx.runMutation(internal.skills.approveSkillByHashInternal, { | |
| sha256hash, | |
| scanner: 'vt', | |
| status, | |
| }) | |
| accUpdated++ |
Prompt To Fix With AI
This is a comment left during a code review.
Path: convex/vt.ts
Line: 837:845
Comment:
**Pending verdict also clears suspicious flag**
The `else if (wasFlagged)` branch runs for any `status` that isn't `malicious` or `suspicious`, which includes `'pending'` (returned by `verdictToStatus` for unrecognized verdict strings). If VT ever introduces a new verdict category not in `BENIGN_VERDICTS`/`MALICIOUS_VERDICTS`/`SUSPICIOUS_VERDICTS`, a previously-flagged skill would have its suspicious flag cleared even though the verdict is inconclusive.
Consider guarding this branch to only clear on a confirmed clean verdict:
```suggestion
} else if (wasFlagged && status === 'clean') {
// Verdict improved from suspicious → clean: clear the stale moderation flag
console.log(`[vt:rescan] ${slug}: verdict improved to clean, clearing suspicious flag`)
await ctx.runMutation(internal.skills.approveSkillByHashInternal, {
sha256hash,
scanner: 'vt',
status,
})
accUpdated++
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Pull request overview
This PR attempts to fix a bug where skills flagged as suspicious by VirusTotal (VT) retain the flagged.suspicious moderation flag even after VT rescan returns a clean verdict. The rescan function updates the version's vtAnalysis.status but previously did not clear the stale flag from the skill's moderationFlags.
Changes:
- Added
wasFlaggedfield to track skills withflagged.suspiciousflag during rescan batch queries - Modified rescan logic to call
approveSkillByHashInternalwhen a previously-flagged skill's VT verdict improves to clean
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| convex/skills.ts | Added wasFlagged boolean field to getActiveSkillBatchForRescanInternal to identify skills with existing suspicious flags |
| convex/vt.ts | Added logic to call approveSkillByHashInternal when wasFlagged is true and verdict is no longer suspicious/malicious |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
convex/vt.ts
Outdated
| } else if (wasFlagged) { | ||
| // Verdict improved from suspicious → clean: clear the stale moderation flag | ||
| console.log(`[vt:rescan] ${slug}: verdict improved to clean, clearing suspicious flag`) | ||
| await ctx.runMutation(internal.skills.approveSkillByHashInternal, { | ||
| sha256hash, | ||
| scanner: 'vt', | ||
| status, | ||
| }) | ||
| accUpdated++ |
There was a problem hiding this comment.
The condition wasFlagged will execute for any status that is not 'malicious' or 'suspicious', including 'pending'. When verdictToStatus returns 'pending' (for unknown verdicts), calling approveSkillByHashInternal will treat it as clean (since isClean = !isMalicious && !isSuspicious) and clear the suspicious flag even though the verdict is actually pending, not clean. The condition should explicitly check that the status is 'clean' before clearing flags.
| await ctx.runMutation(internal.skills.approveSkillByHashInternal, { | ||
| sha256hash, | ||
| scanner: 'vt', | ||
| status, | ||
| }) |
There was a problem hiding this comment.
The call to approveSkillByHashInternal will not clear the suspicious flag for non-privileged users due to a bug in that function's logic. In approveSkillByHashInternal at line 2614, the condition (isSuspicious || alreadyFlagged) && !bypassSuspicious will match when a previously-flagged skill gets a clean verdict, causing it to keep newFlags = ['flagged.suspicious'] instead of clearing the flag. The isClean branch that checks for other scanners (line 2617-2624) never executes because the previous condition matches first. This means the PR's intended fix will not work as described - skills will remain flagged even when VT returns a clean verdict.
convex/vt.ts
Outdated
| } else if (wasFlagged) { | ||
| // Verdict improved from suspicious → clean: clear the stale moderation flag | ||
| console.log(`[vt:rescan] ${slug}: verdict improved to clean, clearing suspicious flag`) | ||
| await ctx.runMutation(internal.skills.approveSkillByHashInternal, { | ||
| sha256hash, | ||
| scanner: 'vt', | ||
| status, | ||
| }) | ||
| accUpdated++ |
There was a problem hiding this comment.
This new functionality lacks test coverage. There should be a test that verifies a previously-flagged skill has its flagged.suspicious flag cleared when VT returns a clean verdict during rescan. This is especially important given the complexity of the multi-scanner flag merging logic in approveSkillByHashInternal.
…ing clean verdicts Address review feedback: - Guard rescan de-escalation with `status === 'clean'` so pending/unknown verdicts don't accidentally clear the suspicious flag - Fix approveSkillByHashInternal where `alreadyFlagged` in the condition `(isSuspicious || alreadyFlagged) && !bypassSuspicious` prevented clean verdicts from reaching the isClean branch that properly checks whether a different scanner set the flag
Summary
rescanActiveSkills) updatesvtAnalysison the version when a verdict changes, but only calledescalateByVtInternalfor suspicious/malicious verdictsvtAnalysis.statuswas updated to"clean"(website shows "Benign") but the skill'smoderationFlagskept the stale"flagged.suspicious"entry (CLI warns "suspicious", website shows warning banner)approveSkillByHashInternalto clear the flag when a previously-flagged skill's verdict improves to cleanReported via https://clawhub.ai/pskoett/self-improving-agent — both scanners say Benign but the skill is still flagged.
Test plan
rescanActiveSkillsclearsflagged.suspiciousfor skills whose VT verdict improved to cleanaccUnchangedGreptile Summary
Fixes a bug where the daily VT rescan would update a version's
vtAnalysis.statusto"clean"when the verdict improved, but left a staleflagged.suspiciousentry in the skill'smoderationFlags. This caused the CLI to warn "suspicious" and the website to show a warning banner even after VT declared the skill benign.convex/skills.ts: Adds awasFlaggedboolean to the rescan batch query, indicating whether the skill currently hasflagged.suspiciousin itsmoderationFlags.convex/vt.ts: UseswasFlaggedin the rescan loop to callapproveSkillByHashInternalwhen a previously-flagged skill's verdict improves, clearing the stale moderation flag.approveSkillByHashInternalensures that if another scanner (e.g., LLM) has independently flagged the skill, the flag is preserved.Confidence Score: 4/5
approveSkillByHashInternalhandles multi-scanner scenarios properly. One minor logic concern: theelse if (wasFlagged)branch doesn't guard againststatus === 'pending', which could clear a suspicious flag on an inconclusive verdict. In practice this edge case is unlikely given VT's current verdict categories, but adding&& status === 'clean'would be more defensive.convex/vt.ts— theelse if (wasFlagged)condition at line 837 could be tightened to only trigger onstatus === 'clean'.Last reviewed commit: a17d333
Context used:
dashboard- AGENTS.md (source)