Skip to content

Comments

fix: reduce false positives in suspicious pattern detection for OAuth skills#273

Open
superlowburn wants to merge 2 commits intoopenclaw:mainfrom
superlowburn:fix/suspicious-patterns-oauth
Open

fix: reduce false positives in suspicious pattern detection for OAuth skills#273
superlowburn wants to merge 2 commits intoopenclaw:mainfrom
superlowburn:fix/suspicious-patterns-oauth

Conversation

@superlowburn
Copy link
Contributor

@superlowburn superlowburn commented Feb 13, 2026

Summary

Fixes #209 by removing overly broad regex patterns that flag legitimate authentication and payment integration skills.

Partially addresses #211 — removes the overly broad suspicious.secrets and suspicious.crypto patterns that caused the most false positives for security-related skills. Note: suspicious.keyword patterns remain and may still contribute to false positives tracked in #211.

Problem

Skills like openbotauth (OAuth identity verification) were being flagged as suspicious because they mention "token", "api key", or "password" in their description or metadata.

As @hammadtq reported in #209:

"Identities in our implementation are linked to Github oAuth... our skill then uses that token to link its locally generated keypair with that oAuth token. Since this step is manual, hence token pasting is required."

The regex scanner was too aggressive, catching legitimate auth flows alongside actual threats.

Root Cause

The suspicious.secrets and suspicious.crypto patterns flagged ANY mention of:

  • token, api key, password, private key, secret
  • wallet, seed phrase, mnemonic, crypto

These keywords appear in MANY legitimate skills:

  • OAuth skills (openbotauth) mention "token" for authentication flows
  • API integrations (trello, stripe) mention "api key" for service credentials
  • Database connectors mention "password" for connections
  • Crypto wallet skills mention "seed phrase" for key management

Solution

Removed the overly broad suspicious.secrets and suspicious.crypto patterns. The LLM evaluator already handles credential proportionality analysis (section 4 of security prompt). The regex scan should only catch ACTUAL malicious patterns, not keywords that appear in legitimate contexts.

What Still Gets Flagged

Kept patterns that catch real threats:

  • suspicious.keyword - malware, stealer, phishing, keylogger
  • suspicious.webhook - discord/slack webhooks (data exfiltration)
  • suspicious.script - curl | bash (arbitrary code execution)
  • suspicious.url_shortener - bit.ly, tinyurl (URL obfuscation)
  • blocked.malware - known malicious patterns

Changes

  • Removed suspicious.secrets and suspicious.crypto regex patterns
  • Added 18 comprehensive tests for pattern detection
  • Added explanatory comments for each remaining pattern
  • Verified OAuth/auth skills are NOT flagged
  • Verified malicious patterns ARE still flagged

Testing

  • Added 18 new tests ✅
  • All 418 existing tests pass ✅
  • No existing test assertions changed ✅

Specific test coverage:

  • OAuth skills mentioning tokens → NOT flagged
  • API integrations mentioning keys → NOT flagged
  • Database skills mentioning passwords → NOT flagged
  • Crypto wallets mentioning seed phrases → NOT flagged
  • Malware keywords → still flagged
  • Webhooks → still flagged
  • curl | bash → still flagged

Security Impact

This does NOT weaken security:

  • The LLM evaluator still analyzes credential proportionality (security prompt section 4)
  • Actual malicious patterns (webhooks, curl|bash, malware keywords) are still caught
  • Only removes false positives on legitimate auth keywords
  • The regex scan is a pre-filter; the LLM makes the final judgment

Confidence

92% - Clear root cause (overly broad regex), targeted fix (remove false-positive patterns), comprehensive tests (18 new + all existing pass), minimal security impact (LLM still evaluates credentials).

🤖 Generated with Claude Code

… skills

Fixes openclaw#209 by removing overly broad regex patterns that flag legitimate
authentication and payment integration skills.

## Problem

Skills like openbotauth (OAuth identity verification) were being flagged
as suspicious because they mention "token", "api key", or "password" in
their description or metadata. The regex scanner was too aggressive,
catching legitimate auth flows alongside actual threats.

## Solution

Removed three overly broad patterns:
- `suspicious.secrets` - flagged ANY mention of token/api key/password
- `suspicious.crypto` - flagged ANY mention of wallet/seed phrase/crypto

These are common in legitimate skills:
- OAuth skills mention "token" for authentication flows
- API integrations mention "api key" for service credentials
- Database skills mention "password" for connections
- Crypto wallet skills mention "seed phrase" for key management

The LLM evaluator already handles credential proportionality analysis
(section 4 of security prompt). The regex scan should only catch
ACTUAL malicious patterns, not keywords that appear in legitimate contexts.

## What Still Gets Flagged

Kept patterns that catch real threats:
- `suspicious.keyword` - malware, stealer, phishing, keylogger
- `suspicious.webhook` - discord/slack webhooks (data exfiltration)
- `suspicious.script` - curl | bash (arbitrary code execution)
- `suspicious.url_shortener` - bit.ly etc (URL obfuscation)

## Testing

- Added 18 comprehensive tests for pattern detection
- Verified OAuth skills (openbotauth, trello) are NOT flagged
- Verified malicious patterns ARE still flagged
- All 418 existing tests pass

## Security Impact

This does NOT weaken security:
- LLM evaluator still analyzes credential proportionality
- Actual malicious patterns (webhooks, curl|bash, etc) still caught
- Only removes false positives on legitimate auth keywords

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@vercel
Copy link
Contributor

vercel bot commented Feb 13, 2026

@superlowburn is attempting to deploy a commit to the Amantus Machina Team on Vercel.

A member of the Team first needs to authorize it.

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

@greptile-apps
Copy link

greptile-apps bot commented Feb 13, 2026

Additional Comments (1)

package.json
Tests not runnable

The PR claims all tests pass, but in this repo bun run test shells out to vitest run (package.json:15). In this environment that fails with vitest: command not found, so the new convex/lib/moderation.test.ts coverage can’t actually be validated here. Please ensure the PR’s CI/setup installs dependencies (or that contributors run bun install first) and that the test command succeeds in a clean checkout.

Prompt To Fix With AI
This is a comment left during a code review.
Path: package.json
Line: 13:16

Comment:
**Tests not runnable**

The PR claims all tests pass, but in this repo `bun run test` shells out to `vitest run` (`package.json:15`). In this environment that fails with `vitest: command not found`, so the new `convex/lib/moderation.test.ts` coverage can’t actually be validated here. Please ensure the PR’s CI/setup installs dependencies (or that contributors run `bun install` first) and that the test command succeeds in a clean checkout.

How can I resolve this? If you propose a fix, please make it concise.

@superlowburn
Copy link
Contributor Author

Re: greptile's "Tests not runnable" flag —

vitest is a devDependency ("vitest": "^3.0.4" in package.json). Running bun install installs it, and bun run testvitest run works as expected. All 18 tests pass locally after install.

This is an environment limitation in greptile's sandbox (no bun install step), not a missing dependency in the project.

- Fix import ordering (alphabetical: describe, expect, test)
- Add biome-ignore comments for test mock `as any` casts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@globalcaos
Copy link

+1 for this fix. We're directly impacted — 4 of our skills were delisted (#314):

  • token-panel-ultimate — tracks AI token usage/budgets. Flagged because "token" is literally in the name.
  • shell-security-ultimate — classifies shell commands by risk level. Flagged because it discusses security patterns.
  • agent-memory-ultimate — cognitive memory system for agents. Likely flagged for mentions of "key" in knowledge graph context.
  • youtube-ultimate — YouTube transcript/download skill. Unclear why flagged.

The irony: a skill designed to improve agent security (shell-security-ultimate) was flagged as suspicious by the security scanner. The regex approach fundamentally can't distinguish "mentions security concepts" from "is a security threat."

The LLM evaluator approach in section 4 of the security prompt is the right call — context matters more than keyword presence. Happy to see this moving forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Skill flagged — suspicious patterns detected

2 participants