improve(icp-cli): accuracy and agent-guidance improvements #159

Open
marc0olo wants to merge 2 commits into main from improve/icp-cli-157

marc0olo commented Apr 10, 2026

Closes #157.

Summary

  • Add --help directive to prevent hallucinated flags
  • Expand pitfall 6: dfx identity whoami → icp identity default (no-arg getter); setter/getter duality made explicit
  • Pitfall 9: add icp network stop to complete the start/deploy/stop lifecycle
  • Pitfall 17 (new): port conflict resolution — --project-root-override for another icp-cli project, gateway.port + icp network status --json for non-icp services
  • Pitfall 18 (new): icp new hangs in CI without --silent
  • Pitfall 19 (new): anonymous identity must not be used on mainnet — local network seeds all identities, mainnet does not; always switch to a named identity and verify funds before deploying
  • Trim icp new note in "Project Creation" to avoid duplicating pitfall 18
  • Add 3 new evals; update "Deploy to mainnet" eval with balance check behavior; fix adversarial eval behaviors; remove ambiguous mops trigger eval
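
The --help directive in the first bullet can also be applied mechanically in scripts. A minimal sketch, assuming that `icp <subcommand> --help` lists every supported flag literally in its output; the helper name `icp_supports_flag` is hypothetical and not part of icp-cli:

```shell
# Hypothetical helper (not part of icp-cli): succeeds only if the help text
# of the given icp subcommand contains the flag as a literal substring.
# Usage: icp_supports_flag <flag> <subcommand...>
icp_supports_flag() {
  local flag="$1"
  shift
  # Ask the CLI itself rather than trusting a remembered flag name.
  icp "$@" --help 2>&1 | grep -qF -- "$flag"
}

# Example (commented out so that sourcing this file runs nothing):
# icp_supports_flag --project-root-override network stop || echo "flag not found" >&2
```

This is a substring check, so it can report a false positive when one flag name is a prefix of another; reading the help output directly remains the reliable route.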

Evaluation results

Output evals: 21/21 with-skill passing

| Eval | With skill | Baseline |
|---|---|---|
| New project setup | 7/7 | 0/7 |
| Deploy to mainnet | 5/5 | 1/5 |
| Migrate from dfx | 6/6 | 0/6 |
| Recipe version pinning | 3/3 | 0/3 |
| Multi-canister environment variables | 5/5 | 1/5 |
| Frontend TypeScript bindings | 6/6 | 0/6 |
| Local network behavior | 3/3 | 0/3 |
| Custom build steps | 4/4 | 1/4 |
| Adversarial: env file pattern | 4/4 | 1/4 |
| Adversarial: dfx commands | 4/4 | 0/4 |
| Candid file generation with recipes | 5/5 | 0/5 |
| Staging and production environments | 4/4 | 0/4 |
| Motoko project setup | 6/6 | 0/6 |
| Full-stack Motoko hello-world with React frontend | 9/9 | 0/9 |
| Adversarial: mixing fields across config styles | 4/4 | 0/4 |
| Adversarial: createActor with old @dfinity/agent pattern | 3/3 | 1/3 |
| Candid opt T representation in bindgen | 4/4 | 1/4 |
| Dev server configuration | 6/6 | 0/6 |
| Port conflict on local network start ⭐ | 5/5 | 0/5 |
| Scripted project creation in CI ⭐ | 3/3 | 0/3 |
| Adversarial: deploying to mainnet with anonymous identity ⭐ | 5/5 | 0/5 |

Trigger evals: 24/25 — removed the one miss ("Set up a Motoko canister with mops" legitimately matched the motoko skill; query was too ambiguous to be a useful icp-cli trigger)


Deploy to mainnet (updated — added balance check behavior)

With skill ✅ 5/5

✓ Uses 'icp deploy -e ic', NOT 'dfx deploy --network ic' or '--network ic'
✓ Mentions cycles are needed
✓ Mentions canister IDs are stored in .icp/data/ and should be committed to git
✓ Does NOT use --network ic flag for deployment
✓ Recommends checking ICP or cycles balance on mainnet before deploying (icp token balance -n ic or icp cycles balance -n ic)

Baseline ❌ 1/5

✗ Uses 'icp deploy -e ic' — uses dfx deploy --network ic instead
✓ Mentions cycles are needed
✗ Mentions canister IDs stored in .icp/data/
✗ Does NOT use --network ic
✗ Recommends checking balance
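
The expected behaviors for this eval amount to a short command sequence. The sketch below uses only the commands named above; the wrapper function `deploy_to_mainnet` is hypothetical, and the balance commands are assumed to print a balance that the operator inspects before the deploy runs:

```shell
# Hypothetical wrapper around the commands this eval expects.
deploy_to_mainnet() {
  # Show balances first: deploying to mainnet needs cycles.
  icp token balance -n ic || return 1
  icp cycles balance -n ic || return 1

  # Deploy by environment (-e ic), not with --network ic.
  icp deploy -e ic || return 1

  # Canister IDs are stored in .icp/data/ and should be committed to git.
}
```

The function only stops on a failing command; it does not judge whether the printed balance is sufficient, which is left to the operator.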

Port conflict on local network start (new)

With skill ✅ 5/5

✓ Distinguishes between two scenarios: another icp-cli project vs. a non-icp service
✓ For another icp-cli project: provides 'icp network stop --project-root-override /path/to/other-project'
✓ For a non-icp service: recommends configuring gateway.port in icp.yaml
✓ Mentions 'icp network status --json' to read gateway URL dynamically
✓ Does NOT suggest killing processes manually or using dfx commands

Baseline ❌ 0/5

✗ No scenario distinction — treats all port conflicts generically
✗ Suggests dfx stop and kill -9 instead
✗ Recommends dfx.json with bind key instead of icp.yaml gateway.port
✗ Never mentions icp network status --json
✗ Explicitly recommends kill -9 and dfx commands
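
The two scenarios this eval distinguishes map to different commands. A sketch using only the commands quoted above; both function names are hypothetical, and the layout of the JSON printed by `icp network status --json` is deliberately not assumed:

```shell
# Scenario 1: the port is held by another icp-cli project's local network.
# Stop that network by pointing the CLI at the other project's root.
stop_other_icp_network() {
  local other_project_root="$1"   # e.g. /path/to/other-project
  icp network stop --project-root-override "$other_project_root"
}

# Scenario 2: the port is held by a non-icp service.
# After setting gateway.port in icp.yaml to a free port, read the gateway URL
# from the CLI instead of hard-coding it in scripts.
print_network_status_json() {
  icp network status --json
}
```

Neither path kills processes by hand, which matches the last expected behavior in the list.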

Scripted project creation in CI (new)

With skill ✅ 3/3

✓ Uses --silent flag with icp new
✓ Passes --subfolder and --define flags
✓ Uses icp (not dfx) for project scaffolding

Baseline ❌ 0/3

✗ Uses dfx new --type rust --no-frontend instead
✗ No --subfolder or --define flags
✗ Uses dfx, not icp
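
A non-interactive invocation might look like the sketch below. It assumes that `icp new` takes the project name as a positional argument and that `--define` takes a single key=value pair; neither detail is stated in this PR, so confirm both with `icp new --help`. The function name and all argument values are hypothetical placeholders:

```shell
# Hypothetical wrapper for CI: --silent suppresses the interactive prompts
# that would otherwise make icp new hang on a runner without a TTY.
create_project_noninteractive() {
  local project_name="$1"        # placeholder, e.g. my-project
  local template_subfolder="$2"  # placeholder template subfolder
  local definition="$3"          # placeholder key=value pair
  icp new "$project_name" \
    --subfolder "$template_subfolder" \
    --define "$definition" \
    --silent
}
```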

Adversarial: deploying to mainnet with the anonymous identity (new)

With skill ✅ 5/5

✓ Warns that anonymous identity is shared and uncontrolled on mainnet
✓ Recommends icp identity default / icp identity new (not dfx identity use)
✓ Recommends icp token balance -n ic / icp cycles balance -n ic (not dfx ledger balance)
✓ Mentions new identity needs to be funded before deploying
✓ Presents identity switch and balance check as required prerequisites

Baseline ❌ 0/5

✗ Never mentions anonymous identity or its shared nature
✗ Uses dfx identity whoami / dfx identity get-principal
✗ Uses dfx wallet --network ic balance
✗ Only vaguely says "identity needs to be funded"
✗ Redirects to dfx entirely, frames checks as optional
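
The identity prerequisite can be enforced in a deploy script. A minimal sketch, assuming the no-argument form of `icp identity default` prints only the current identity's name and that the anonymous identity is reported as `anonymous`; both are assumptions about the CLI's output, and the function name is hypothetical:

```shell
# Hypothetical guard: fail unless a named (non-anonymous) identity is selected.
require_named_identity() {
  local current
  current="$(icp identity default)" || return 1   # no-arg form acts as the getter
  if [ "$current" = "anonymous" ]; then
    echo "Refusing to deploy to mainnet with the anonymous identity." >&2
    echo "Create one with 'icp identity new <name>', select it with 'icp identity default <name>', then fund it." >&2
    return 1
  fi
  echo "Using identity: $current"
}
```

Running the guard before the balance check keeps both steps required rather than optional, in line with the last expected behavior above.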

- Add --help directive to prevent hallucinated flags
- Expand pitfall 6 with dfx identity whoami → icp identity default (getter)
- Add icp network stop to complete lifecycle in pitfall 9
- Add pitfall 17: port conflict resolution (--project-root-override and gateway.port)
- Add pitfall 18: icp new hangs in CI without --silent
- Strengthen icp new note with --silent and CI-hanging warning
- Add evals for port conflict and scripted project creation

marc0olo requested review from a team and JoshDFN as code owners April 10, 2026 08:46

github-actions bot commented Apr 10, 2026

Skill Validation Report

Validating skill: /home/runner/work/icskills/icskills/skills/icp-cli

Structure

  • Pass: SKILL.md found
  • Pass: all files in references/ are referenced

Frontmatter

  • Pass: name: "icp-cli" (valid)
  • Pass: description: (507 chars)
  • Pass: license: "Apache-2.0"
  • Pass: metadata: (2 entries)

Tokens

  • Warning: SKILL.md body is 5122 tokens (spec recommends < 5000)

Markdown

  • Pass: no unclosed code fences found

Tokens

| File | Tokens |
|---|---|
| SKILL.md body | 5,122 |
| references/binding-generation.md | 1,031 |
| references/dev-server.md | 690 |
| references/dfx-migration.md | 2,620 |
| Total | 9,463 |

Content Analysis

| Metric | Value |
|---|---|
| Word count | 2,654 |
| Code block ratio | 0.21 |
| Imperative ratio | 0.12 |
| Information density | 0.17 |
| Instruction specificity | 0.83 |
| Sections | 17 |
| List items | 59 |
| Code blocks | 30 |

References Content Analysis

| Metric | Value |
|---|---|
| Word count | 2,133 |
| Code block ratio | 0.25 |
| Imperative ratio | 0.13 |
| Information density | 0.19 |
| Instruction specificity | 0.80 |
| Sections | 18 |
| List items | 33 |
| Code blocks | 12 |

Contamination Analysis

| Metric | Value |
|---|---|
| Contamination level | low |
| Contamination score | 0.12 |
| Primary language category | shell |
| Scope breadth | 3 |
  • Warning: Language mismatch: config, javascript (2 categories differ from primary)

References Contamination Analysis

| Metric | Value |
|---|---|
| Contamination level | low |
| Contamination score | 0.03 |
| Primary language category | javascript |
| Scope breadth | 2 |
  • Warning: Language mismatch: shell (1 category differs from primary)

Result: 1 warning

Project Checks


✓ Project checks passed for 1 skills (0 warnings)

- Pitfall 19: warn against using the anonymous identity on mainnet —
  local network seeds all identities so local always works, but on
  mainnet the anonymous identity is shared and uncontrolled
- Trim Project Creation section to remove duplicated CI warning (kept in pitfall 18)
- Update 'Deploy to mainnet' eval: add balance check as expected behavior
- Add adversarial eval for deploying to mainnet with anonymous identity
- Fix adversarial eval behaviors: require icp commands explicitly, reframe
  last behavior as positive assertion to avoid false failures
- Remove mops trigger eval: query was ambiguous, legitimately matched motoko skill

Development

Successfully merging this pull request may close these issues.

icp-cli skill: accuracy and agent-guidance improvements