Per-Pack Tests: Best-Effort Verification for Non-Deterministic Installation #319
Replies: 2 comments
My question is: what about the installation process of PAI packs benefits from the non-determinism of an agentic install, particularly when it comes to the core pack or other base-level PAI systems like the observability monitor or the agent creation pack? Are the agents tweaking or customizing various pack prompts as they install? Apologies if this is a noob question. Happy to be enlightened ;)
The Pack system in v3.0 is the foundation for this: each Pack bundles its own code, and the integrity system does reference validation. Deterministic test scripts per Pack are a good idea for a future contribution.
The Problem
Every PAI installation is bespoke:
AI-driven installation is non-deterministic. Same pack + same instructions can produce different results depending on what's already installed and how the agent interprets context.
Aside:
validate-pack.ts is very nice and checks that the expected structure exists. I would like to see similar tests for base PAI signatures, so that consistent community conventions can be upheld as people's PAI forks slowly diverge with their user preferences. These base tests should live somewhere that is easy to git pull. Other upstream PAI improvements, IMO, should be merged by intention using our own PAI rather than by explicit deltas.
The Insight
If installation is non-deterministic, verification should be as deterministic as possible.
Each pack should include tests that answer: "Did this pack install correctly in this user's bespoke PAI?"
Agent installs however it interprets best → runs tests → pass/fail → agent knows if it succeeded and can retry or adjust.
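To make that loop concrete, here is a minimal sketch of install → test → retry. Every name in it (TestResult, installWithVerification, the callback shapes) is hypothetical and not part of PAI; the install and verify steps are passed in as plain functions, and everything is kept synchronous for brevity where a real installer would probably be async.

```typescript
// Hypothetical types: none of these names come from PAI itself.
interface TestResult {
  name: string;
  passed: boolean;
  detail?: string;
}

// The install step receives the previous failures so the agent can adjust.
type InstallStep = (attempt: number, previousFailures: TestResult[]) => void;
// The verify step is the deterministic part: it runs the pack's tests.
type VerifyStep = () => TestResult[];

interface InstallOutcome {
  succeeded: boolean;
  attempts: number;
  failures: TestResult[];
}

function installWithVerification(
  install: InstallStep,
  verify: VerifyStep,
  maxAttempts: number = 3,
): InstallOutcome {
  let failures: TestResult[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Non-deterministic part: the agent installs however it interprets best.
    install(attempt, failures);
    // Deterministic part: keep only the tests that did not pass.
    failures = verify().filter((result) => !result.passed);
    if (failures.length === 0) {
      return { succeeded: true, attempts: attempt, failures: [] };
    }
  }
  // Attempt budget exhausted: report the failures from the last run.
  return { succeeded: false, attempts: maxAttempts, failures };
}
```

The attempt cap is one simple way to bound the loop; a wall-clock timeout would work just as well.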
Proposal
Packs include a tests/ directory with best-effort verification.
The agent gets pass/fail feedback to act on and can iterate the Ralph-loop-ish install process, maybe with a sensible failure timeout.
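As a rough illustration of what one file in such a tests/ directory might contain, here is a sketch of a best-effort check that the paths a pack expects actually exist in the user's PAI directory. The function names, the result shape, and any paths passed in are assumptions for illustration only; nothing here reflects an actual PAI layout or the API of validate-pack.ts.

```typescript
import { existsSync } from "node:fs";
import { join } from "node:path";

// Hypothetical result shape for a single check.
interface CheckResult {
  name: string;
  passed: boolean;
  detail?: string;
}

// Best-effort check: does each expected path exist under the given root?
function checkPathsExist(root: string, relativePaths: string[]): CheckResult[] {
  return relativePaths.map((rel): CheckResult => {
    const full = join(root, rel);
    return existsSync(full)
      ? { name: `exists:${rel}`, passed: true }
      : { name: `exists:${rel}`, passed: false, detail: `missing: ${full}` };
  });
}

// Collapse individual checks into the pass/fail signal the agent acts on.
function summarize(results: CheckResult[]): { passed: boolean; failures: CheckResult[] } {
  const failures = results.filter((result) => !result.passed);
  return { passed: failures.length === 0, failures };
}
```

A pack's test script could call checkPathsExist with the user's PAI root and the files the pack is supposed to create, then print the summary or exit non-zero when passed is false, so the agent gets an unambiguous signal plus details about what is missing.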
Thoughts?