Merged
101 changes: 101 additions & 0 deletions .github/workflows/publish.yml
@@ -0,0 +1,101 @@
name: Publish to GitHub Packages

on:
  push:
    tags:
      - 'v*'

# Concurrency: one publish at a time per tag.
concurrency:
  group: publish-${{ github.ref }}
  cancel-in-progress: false

# Workflow-level minimal permissions. `packages: write` is what
# `npm publish --registry=https://npm.pkg.github.com` actually needs;
# `contents: read` lets actions/checkout fetch the tagged commit.
permissions:
  contents: read
  packages: write
  id-token: write

jobs:
  publish:
    name: Publish @padosoft/agentic-qa-kit to GitHub Packages
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - name: Checkout tagged commit
        uses: actions/checkout@v4
        with:
          # The tag itself, not the default branch, so the published
          # artifact matches the release notes.
          ref: ${{ github.ref }}
          fetch-depth: 1

      - name: Pin Bun version
        uses: oven-sh/setup-bun@v2
        with:
          bun-version-file: .bun-version

      - name: Setup Node 22 (publisher)
        uses: actions/setup-node@v4
        with:
          node-version: '22'
          # Configure npm for GitHub Packages auth — actions/setup-node
          # writes an .npmrc with the right token + registry.
          registry-url: 'https://npm.pkg.github.com'
          scope: '@padosoft'
          always-auth: true

      - name: Verify tag matches packages/kit version
        # The publish workflow is tag-triggered. The kit's package.json
        # version MUST match the tag (modulo a leading 'v') so the
        # published tarball's metadata aligns with the release.
        run: |
          TAG="${GITHUB_REF##*/}"
          TAG_VERSION="${TAG#v}"
          PKG_VERSION="$(node -e "console.log(require('./packages/kit/package.json').version)")"
          echo "tag=$TAG_VERSION pkg=$PKG_VERSION"
          if [ "$TAG_VERSION" != "$PKG_VERSION" ]; then
            echo "::error::Tag $TAG_VERSION does not match packages/kit/package.json version $PKG_VERSION"
            exit 1
          fi

      - name: Install workspaces
        run: bun install --frozen-lockfile

      - name: Build the whole monorepo (so kit's bundle has all deps available)
        run: bun run build

      - name: Verify built bundle exists
        # The build step writes dist/cli.cjs + dist/cli.bundle.meta.json
        # via packages/kit/scripts/build-bundle.mjs. Fail fast if not.
        run: |
          if [ ! -f packages/kit/dist/cli.cjs ]; then
            echo "::error::packages/kit/dist/cli.cjs missing — build-bundle.mjs did not run"
            exit 1
          fi
          if [ ! -f packages/kit/dist/cli.bundle.meta.json ]; then
            echo "::error::publish bundle meta missing — partial build"
            exit 1
          fi
          ls -lh packages/kit/dist/cli.cjs
          cat packages/kit/dist/cli.bundle.meta.json

      - name: Rewrite name + workspace:* deps for publish
        # publish-prep.mjs swaps @aqa/kit → @padosoft/agentic-qa-kit and
        # pins every workspace:* dep to the kit's current version. The
        # rewrite is local to this CI checkout and never committed back.
        run: node packages/kit/scripts/publish-prep.mjs

      - name: npm publish (GitHub Packages)
        working-directory: packages/kit
        env:
          # actions/setup-node@v4 wires NODE_AUTH_TOKEN into the .npmrc
          # it generated; npm picks it up automatically when publishing
          # to a scope-bound registry.
          NODE_AUTH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        # --provenance + --access public + the GH-Packages registry is
        # the complete publish contract. The kit's package.json already
        # carries publishConfig.access=public + publishConfig.registry.
        run: npm publish --provenance --access public
38 changes: 38 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,44 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

## [1.9.0] — 2026-05-21 — Junior quick-start truthing

The v1.x roadmap is fully closed and the kit ships to GitHub Packages as `@padosoft/agentic-qa-kit`. This release closes the four gaps an external junior would have hit if they tried to follow the README quick-start in v1.8:

### Added

- **`aqa install-agent-files --targets …`** CLI verb (PR #52). Wires the existing `renderForTargets()` from `@aqa/adapters` into a real command. Generates `CLAUDE.md` + `AGENTS.md` + `GEMINI.md` + `.github/copilot-instructions.md` plus per-agent skills under `.claude/`, `.agents/`, `.gemini/`, `.github/`. Flags: `--targets <csv|repeat>`, `--project-name <slug>`, `--force`, `--dry-run`. An unknown target fails fast without writing anything. Existing files are preserved unless `--force` is passed.
- **`aqa report [--run-id <id>] [--format md|json|both]`** CLI verb (PR #53). Renders `events.jsonl` + `findings.jsonl` from a run into `report.md` (auditor-friendly) and `report.json` (stable shape consumed by the admin UI). Defaults to the latest run by file mtime so hash-suffixed `--seed` ids work alongside ISO-prefixed ones. Strict on bad inputs: missing artifacts, malformed JSONL, traversal in `--run-id`, symlinked run dirs all error fast.
- **`aqa admin [--port <n>] [--host <h>]`** CLI verb (PR #54). Boots the admin SPA + `makeApi()` in a single Node process on `http://127.0.0.1:5173`. The bundled SPA ships inside the kit tarball; the in-memory store is seeded from `.aqa/runs/<id>/{events,findings}.jsonl` so the admin shows real local data out of the box. Path-traversal-safe static serving with SPA fallback for client-side routes.
- **`@aqa/pack-author`** new workspace package (PR #54). Extracted `runPackNew` from `@aqa/kit` to break the kit↔server build cycle that emerged when kit started depending on server (via the new `aqa admin` command). `@aqa/server`'s `POST /api/packs/scaffold` and `@aqa/kit`'s `aqa pack new` both consume it. Kit keeps a 5-line re-export shim so existing in-kit imports work unchanged.
- **GitHub Packages publish pipeline** (PR #55). New `.github/workflows/publish.yml` runs on every `v*` tag and publishes `@padosoft/agentic-qa-kit` to `https://npm.pkg.github.com` with `--provenance`. The kit publishes as a single bundled `dist/cli.cjs` (~460 KB) via esbuild — every `@aqa/*` workspace dep + every npm dep is inlined; only Node built-ins stay external. `packages/kit/scripts/publish-prep.mjs` swaps `@aqa/kit` → `@padosoft/agentic-qa-kit` and pins `workspace:*` deps to the kit's version at publish time only (the workspace keeps its internal name so other packages can keep referencing it).
- **README + `docs/getting-started.md` rewritten** to match the actually-shipped CLI surface. Adds the `.npmrc` snippet for GH Packages auth, the 10-step quick-start, the `aqa admin` boot path, and a single-command `bun run e2e:ecosystem` pointer for monorepo contributors.
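
The `aqa report` entry above mentions two behaviours that are easy to get wrong: defaulting to the latest run by mtime, and rejecting traversal in `--run-id`. The following is a minimal sketch of both, not the kit's actual code. `RunEntry`, `isSafeRunId`, and `pickLatestRun` are hypothetical names, and the exact rejection rules are an assumption.

```typescript
// Hypothetical sketch: names and rules are assumptions, not the kit's implementation.

interface RunEntry {
  id: string;      // directory name under .aqa/runs/
  mtimeMs: number; // last-modified time in milliseconds
}

// Reject ids that could escape .aqa/runs/: empty, ".", "..",
// or anything containing a path separator or a NUL byte.
function isSafeRunId(id: string): boolean {
  if (id.length === 0 || id === "." || id === "..") return false;
  return !/[\/\\\0]/.test(id);
}

// Pick the most recently modified run; undefined when there are none.
// Sorting by mtime rather than by name is what lets hash-suffixed ids
// coexist with ISO-prefixed ones.
function pickLatestRun(runs: RunEntry[]): RunEntry | undefined {
  let latest: RunEntry | undefined;
  for (const run of runs) {
    if (latest === undefined || run.mtimeMs > latest.mtimeMs) latest = run;
  }
  return latest;
}
```

Comparing mtimes avoids any assumption about how run ids are formatted, which is the reason the changelog gives for the default.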

### Changed

- **kit `package.json` `name` field policy.** The workspace name stays `@aqa/kit` so other monorepo packages can reference it. The published artifact's `name` is set at publish time from the new `aqa.publishName` declaration (`@padosoft/agentic-qa-kit`). This dual-naming keeps internal imports stable while satisfying GH Packages' `<scope> === <repo-owner>` requirement.
- **Bundle format.** Kit now ships as `dist/cli.cjs` (CJS-in-.cjs) instead of separate per-file ESM modules. The `.cjs` extension overrides the package-level `"type": "module"` so Node loads it as CJS and bundled deps that internally `require('process')` resolve cleanly.
- **`packages/server/src/api.ts`** imports `runPackNew` from `@aqa/pack-author` (was `@aqa/kit`).
- **`packages/server/src/index.ts`** re-exports `ApiHandler`, `ApiMethod`, `ApiRequest`, `ApiResponse` so kit can consume them type-only.
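
The dual-naming policy described above can be pictured as a pure transformation on the kit's `package.json`. This is a sketch under stated assumptions, not the contents of `publish-prep.mjs`: the `PackageJson` shape and the `prepareForPublish` and `pinWorkspaceDeps` helpers are hypothetical, and only `dependencies` and `devDependencies` are handled.

```typescript
// Hypothetical sketch of the publish-time rewrite; not the real publish-prep.mjs.

interface PackageJson {
  name: string;
  version: string;
  aqa?: { publishName?: string };
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Replace every `workspace:*` range with the kit's own version.
function pinWorkspaceDeps(
  deps: Record<string, string> | undefined,
  version: string,
): Record<string, string> | undefined {
  if (deps === undefined) return undefined;
  const pinned: Record<string, string> = {};
  for (const [name, range] of Object.entries(deps)) {
    pinned[name] = range === "workspace:*" ? version : range;
  }
  return pinned;
}

// Return a copy with the published name and pinned deps; the input is not mutated,
// mirroring the "never committed back" behaviour described in the workflow.
function prepareForPublish(pkg: PackageJson): PackageJson {
  const publishName = pkg.aqa?.publishName;
  if (!publishName) throw new Error("aqa.publishName is missing");
  const out: PackageJson = { ...pkg, name: publishName };
  const deps = pinWorkspaceDeps(pkg.dependencies, pkg.version);
  if (deps) out.dependencies = deps;
  const devDeps = pinWorkspaceDeps(pkg.devDependencies, pkg.version);
  if (devDeps) out.devDependencies = devDeps;
  return out;
}
```

Keeping the rewrite a copy-returning function makes it straightforward to check in isolation that the workspace name is untouched.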

### Fixed

- **`doctor.ts` hint no longer says `(Task 4)`** — the verb exists now, so the suggestion is the full command a junior can paste.
- **Slugifier caps at 64 chars** (Slug schema max). Previously a long project directory name would slip through `aqa init` / `aqa install-agent-files` and trip `aqa validate` later. The slugifier now truncates to the cap and then re-strips trailing dashes, so the truncated slug stays schema-conformant.
- **`KNOWN_TARGETS` derived from `@aqa/adapters.adapters`** instead of being a hardcoded duplicate — adding a new adapter (e.g. `opencode`) auto-extends `--targets`.
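
The slugifier fix above comes down to stripping trailing dashes a second time after truncation. A minimal sketch of that idea follows; the `slugify` function and its character rules are assumptions for illustration, and only the 64-character cap comes from the changelog.

```typescript
// Hypothetical sketch: the character rules are assumed; only the 64-char cap is from the changelog.

const SLUG_MAX = 64;

function slugify(input: string, max: number = SLUG_MAX): string {
  const slug = input
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse each run of non-alphanumerics into one dash
    .replace(/^-+|-+$/g, "");    // strip leading and trailing dashes
  // Truncation can cut right after a dash, so strip trailing dashes again.
  return slug.slice(0, max).replace(/-+$/g, "");
}
```

Without the second strip, an input whose 64th slug character is a dash would produce a slug ending in `-`.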

## [1.8.3] — 2026-05-20 — Live ecosystem e2e + roadmap closure sync

PR #51 — dedicated ecosystem Playwright smoke (`packages/admin/test/e2e/ecosystem-live.e2e.ts`) with a single-command stack bootstrap (`scripts/ecosystem-stack.mjs`). Boots `examples/bun-api`, runs a real `aqa run --profile smoke`, serves live `/api/*` from `@aqa/server.makeApi` + `MemoryStore`, drives the admin against the live backend, asserts `finding_emitted` is visible from `/api/audit` and chain verification returns `CHAIN OK`. Command: `bun run e2e:ecosystem`.

## [1.8.2] — 2026-05-20 — Ecosystem smoke e2e hardening

PR #50 — `scripts/e2e-cli.mjs` no longer stops at version/help/doctor/validate. It now boots a local HTTP `/healthz` target, seeds a schema-valid local smoke pack/profile, executes `aqa run --profile smoke` with the real HTTP probe runner, and asserts that run artifacts are emitted under `.aqa/runs/<run-id>/`.

## [1.8.1] — 2026-05-20 — Audit chain canonical reconciliation

PR #49 — aligned `@aqa/compliance.verifyEventChain` with `@aqa/runner.EventChainWriter`: hash recomputation excludes `prev_hash` from the canonical body, and a first-record `prev_hash: null` is canonical instead of an expected all-zero literal.
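
To make the two reconciled rules concrete, here is a minimal hash-chain sketch. It is not the code of `verifyEventChain` or `EventChainWriter`. The record shape, the SHA-256 choice, the key-sorted canonical JSON, and feeding `prev_hash` into the digest separately from the body are all assumptions. Only two points come from the entry above: `prev_hash` is excluded from the canonical body, and the first record carries `prev_hash: null`.

```typescript
// Hypothetical sketch: record shape, hash function, and canonical form are assumptions.
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [k: string]: Json };

interface ChainRecord {
  prev_hash: string | null;
  hash: string;
  [field: string]: Json;
}

// Key-sorted JSON so the same body always serialises to the same string.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return "[" + value.map(canonicalize).join(",") + "]";
  const keys = Object.keys(value).sort();
  return "{" + keys.map((k) => JSON.stringify(k) + ":" + canonicalize(value[k] as Json)).join(",") + "}";
}

// Hash = sha256(prev_hash or "" for the first record, then the canonical body).
// The body excludes both `prev_hash` and `hash`.
function recordHash(record: ChainRecord): string {
  const body: { [k: string]: Json } = {};
  for (const [key, value] of Object.entries(record)) {
    if (key !== "prev_hash" && key !== "hash") body[key] = value;
  }
  return createHash("sha256")
    .update(record.prev_hash ?? "")
    .update(canonicalize(body))
    .digest("hex");
}

// Append a record, linking it to the previous hash (null for the first record).
function appendRecord(chain: ChainRecord[], body: { [k: string]: Json }): ChainRecord {
  const last = chain[chain.length - 1];
  const draft: ChainRecord = { ...body, prev_hash: last ? last.hash : null, hash: "" };
  draft.hash = recordHash(draft);
  chain.push(draft);
  return draft;
}

// Walk the chain: each link must match, and each stored hash must recompute.
function verifyChain(records: ChainRecord[]): boolean {
  let expectedPrev: string | null = null;
  for (const record of records) {
    if (record.prev_hash !== expectedPrev) return false;
    if (record.hash !== recordHash(record)) return false;
    expectedPrev = record.hash;
  }
  return true;
}
```

The writer and the verifier must agree on both rules; a verifier that expected an all-zero first `prev_hash` would reject every chain this writer produces.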

## [1.3.0] — 2026-05-18

### Added — quality polish (no new packages)
97 changes: 73 additions & 24 deletions README.md
@@ -73,7 +73,7 @@ Coding agents (Claude Code, Codex CLI, Gemini CLI, GitHub Copilot CLI) are great

## Quick start (junior-friendly)

> **Status note:** the kit reached **v1.0 GA** (24-task roadmap complete) and is now at **v1.1**. The 18 workspace packages (`@aqa/schemas`, `@aqa/kit`, `@aqa/runner`, `@aqa/reporter`, `@aqa/server`, `@aqa/admin`, `@aqa/compliance`, `@aqa/methodology`, …) ship from this monorepo. Detailed walk-through: [`docs/getting-started.md`](docs/getting-started.md).
> **Status note:** the kit reached **v1.0 GA** (24-task roadmap complete) and is now at **v1.9**. The `@padosoft/agentic-qa-kit` CLI ships as a single bundled tarball from GitHub Packages. Detailed walk-through: [`docs/getting-started.md`](docs/getting-started.md).

### 1. Install Bun

@@ -85,65 +85,111 @@ curl -fsSL https://bun.sh/install | bash
powershell -c "irm bun.sh/install.ps1 | iex"
```

### 2. Install the kit in your project
### 2. Tell your project where to find the kit (GitHub Packages auth)

GitHub Packages requires authentication even for public packages. As a one-time setup, create a PAT with the `read:packages` scope at [github.com/settings/tokens](https://github.com/settings/tokens), then reference it from a per-project `.npmrc` through the `GITHUB_TOKEN` environment variable:

```ini
# .npmrc — at the root of your project
@padosoft:registry=https://npm.pkg.github.com
//npm.pkg.github.com/:_authToken=${GITHUB_TOKEN}
```

Export the token in your shell (or your CI secrets):

```bash
export GITHUB_TOKEN=ghp_XXXXXXXXXXXXXXXXXXXX
```

### 3. Install the kit in your project

```bash
cd /path/to/your/project
bun add -d agentic-qa-kit
bun add -d @padosoft/agentic-qa-kit
```

> _If you don't have a project yet, clone `examples/bun-api` from this repo (available in v0.1.0)._
> _If you don't have a project yet, clone `examples/bun-api` from this repo as a starting point._

### 3. Initialize the AQA workspace
### 4. Initialize the AQA workspace + verify

```bash
bunx aqa init
bunx aqa init # scaffold .aqa/{project,risk-map,profiles}.yaml + testing.md
bunx aqa doctor # green/yellow/red checklist of kit health
bunx aqa validate # schema-check every .aqa/* file against @aqa/schemas
```

Detects your stack and creates `.aqa/` with `testing.md`, `risk-map.yaml`, `profiles.yaml`, and scenarios for the packs your project matches.
`init` detects your stack (Bun/Node, framework, DB, SUT type) and creates a `.aqa/` directory anchored to the packs your project matches.

### 4. Install agent-specific files (pick one or many)
### 5. Install agent-specific files (one or many)

```bash
bunx aqa install-agent-files --targets claude,codex,gemini,copilot
```

This generates `CLAUDE.md` + `.claude/skills/aqa-*`, `AGENTS.md` + `.agents/skills/`, `GEMINI.md` + `.gemini/skills/`, `.github/copilot-instructions.md` + `.github/skills/`.
Generates `CLAUDE.md` + `.claude/skills/aqa-*`, `AGENTS.md` + `.agents/skills/`, `GEMINI.md` + `.gemini/skills/`, and `.github/copilot-instructions.md` + `.github/skills/`. Existing files are preserved unless you pass `--force`. Add `--dry-run` to see what would change first.

### 5. Run your first agentic QA pass
### 6. Edit `.aqa/risk-map.yaml` (declare what must never break)

Replace the placeholder risk with the one that actually matters for your project. **The risk map is the heart of the kit — generic risks produce generic findings.**

```yaml
- id: r-token-replay
category: auth
title: Tokens remain valid past rotation
severity: critical
likelihood: possible
invariants:
- id: inv-token-rotation
statement: Old tokens become invalid within 60 seconds of rotation.
```

### 7. Run your first agentic QA pass

```bash
bunx aqa run --profile smoke
```

A 10-minute, non-destructive sweep. When it finishes:
A fast, non-destructive sweep. Each run is written to `.aqa/runs/<run-id>/` with `events.jsonl`, `findings.jsonl`, and 3-level replay artifacts (`repro.sh`, `repro.curl`, `repro.playwright.ts`).

### 8. Render the report

```bash
bunx aqa report
bunx aqa report # latest run, Markdown + JSON
bunx aqa report --run-id <id> # explicit run
bunx aqa report --format md # just report.md
```

You'll see findings like:
Writes `report.md` and `report.json` inside the same run directory. You'll see findings like:

```
AQA-2026-0001 [P1] Cross-tenant data leak (verified, 3/3 deterministic replay)
AQA-2026-0002 [P3] Missing rate limit on /api/search
```

### 6. Open the admin panel
### 9. Boot the admin panel (single command)

```bash
bun --filter @aqa/admin dev
bunx aqa admin
```

Then open the local URL shown by Vite (normally `http://127.0.0.1:5173`) and inspect runs, findings, replay artifacts, and audit chain state.
Opens `http://127.0.0.1:5173`. The admin SPA + API server boot in one process, seeded from your local `.aqa/runs/`. Inspect runs, findings, and replay artifacts, and verify the hash-chained audit log in-browser. Press `Ctrl-C` to stop.

| Flag | Effect |
|---|---|
| `--port <n>` | listen on a specific port (default 5173) |
| `--host <h>` | bind host (default `127.0.0.1`; use `0.0.0.0` to expose on LAN) |

### 7. Reproduce from generated artifacts
### 10. Reproduce from generated artifacts

```bash
ls .aqa/runs/<run-id>/
# events.jsonl findings.jsonl report.md report.json
# repro.sh repro.curl repro.playwright.ts
```

Each run stores replay artifacts (`repro.sh`, `repro.curl`, `repro.playwright.ts`) so you can reproduce findings deterministically and confirm fixes.
Each finding ships with a deterministic replay artifact so you can reproduce it, hand it to a teammate, or attach it to a PR.

> **Want the whole ecosystem in one go?** From a clone of `padosoft/agentic-qa-kit`, run `bun run e2e:ecosystem`. It boots `examples/bun-api`, runs a real `aqa run --profile smoke` against it, and opens the admin against the live data. Single command, end-to-end smoke.

## The mental model in 7 words

@@ -156,12 +202,13 @@ Every concept in AQA is one of these seven things or a tool that operates on the
## How you use it

1. `aqa init`: detect your repo and scaffold `.aqa/`.
2. Edit `risk-map.yaml`: declare what must never break.
3. Install agent files: Claude/Codex/Gemini/Copilot instructions + skills.
2. `aqa install-agent-files --targets …`: write Claude/Codex/Gemini/Copilot instructions + skills.
3. Edit `risk-map.yaml`: declare what must never break.
4. `aqa run --profile smoke`: execute scenarios with probes + oracles.
5. Open admin: `bun --filter @aqa/admin dev`.
6. Inspect findings, replay deterministically, verify audit chain.
7. Iterate risks + scenarios until `release-gate` is green.
5. `aqa report`: render `report.md` + `report.json` from the latest run.
6. `aqa admin`: boot the SPA + API on `127.0.0.1:5173`, seeded from local runs.
7. Inspect findings, replay deterministically, verify audit chain.
8. Iterate risks + scenarios until `release-gate` is green.

## Multi-agent

@@ -219,10 +266,12 @@ Full diagram: [`docs/architecture/reference.md`](docs/architecture/reference.md)
| `v1.5` | **Admin design integration — shipped** | 30-screen hi-fi prototype bundled, Playwright E2E gate, theme + palette + Findings kanban |
| `v1.6` | **`aqa run` + bundled packs — shipped** | Three-tier pack discovery, atomic run-dir, applies_when filtering, agent-mode rejection until driver lands |
| `v1.7` | **Pack authoring + admin CRUD — shipped** | `PACK-AUTHORING.md`, `aqa pack new`, admin Create-pack/Import-manifest wizards, full Profile/Risk/Scenario CRUD (Delete/Edit/Clone), Agents wired to `/api/agents`, Operations + Admin pages wired to `/api/audit` / `/api/cost/summary` / `/api/queue` / `/api/notifications` / `/api/tokens` / `/api/orgs`, scenario YAML editor, schema-conforming mock-id migration, `Agent` schema, `agents:read`/`agents:edit` permissions, atomic `Store.createProfile/createScenario` |
| `v1.8` | **Live ecosystem e2e — shipped** | Real HTTP probe runner, release-gate finding enforcement, single-command ecosystem stack (`bun run e2e:ecosystem`), Playwright admin-against-live-API smoke, audit-chain canonical reconciliation |
| `v1.9` | **Junior quick-start truthing — shipped** | `aqa install-agent-files` + `aqa report` + `aqa admin` CLI verbs (previously documented but unwired), `@aqa/pack-author` extracted to break kit↔server build cycle, esbuild bundled `dist/cli.cjs`, GitHub Packages publish workflow on `v*` tags, README quick-start rewritten to match the actually-shipped CLI surface |

## Status

**GA (`v1.0` shipped, `v1.7` current).** The full 24-task roadmap is closed:
**GA (`v1.0` shipped, `v1.9` current).** The full 24-task roadmap is closed:
schemas, CLI (`@aqa/kit`), 5 baseline packs, multi-agent adapters
(Claude/Codex/Gemini/Copilot), runner with hash-chained audit, reporter
with 3-level replay, admin panel, server + runner fleet, on-prem LLM