148 changes: 148 additions & 0 deletions .agents/skills/implementing/SKILL.md
@@ -0,0 +1,148 @@
---
name: implementing
description: Non-interactive sandbox implementation from an approved spec to a PR. Use when the user mentions "implement", "implement spec", "implement specs/<feature>.md", sandbox implementation, or autonomous implementation from a specification.
license: MIT
metadata:
  author: VTEX
  version: "1.2.1"
---

# Implementing — Spec to Code

Implement a feature **autonomously** from an approved `specs/<feature-name>.md`.

## Execution model

This skill assumes a **sandboxed, asynchronous** run: **no human interaction** during execution (no questions, confirmations, or waiting for replies).

| | |
|---|---|
| **Input** | `specs/<feature-name>.md` (must be `Approved` — see Phase 1) |
| **Success output** | Pull Request on `feat/<feature-name>`, spec status → `Done` |
| **Blocked output** | Agent **ends** without a completed feature PR; a **GitHub issue** documents why implementation could not finish |

**Pipeline (end-to-end):** the **specification** skill produces the spec; this skill consumes it. There is **no intermediate planning artifact or skill** — you plan, decompose work, and sequence tasks internally (same idea as composable workflows in [superpowers](https://github.com/obra/superpowers), but without an extra handoff step). The spec is the single handoff contract, analogous to how [autoresearch](https://github.com/karpathy/autoresearch) treats `program.md` as the human-authored context and keeps a tight edit surface for the agent.

**Operating loop:** pick a user story → implement with tests → verify against acceptance criteria → commit → repeat until every story is done. **Evidence over claims:** nothing is “done” until tests and checks prove it.

## Constraints

**What you CAN do:**
- Create and modify source code, tests, and configuration files required by the spec
- Install dependencies explicitly mentioned in the Technical Contract or required by the chosen approach
- Create branches, commit, and open PRs
- Open a **GitHub issue** (or equivalent) when implementation is **impossible** to complete after tie-breakers and reasonable effort — see *Non-interactive execution*

**What you CANNOT do:**
- Implement features, endpoints, or behaviors not described in the spec
- Change the spec file itself, apart from the status update to `Done` in Phase 5. If the spec is wrong, contradictory, or blocking, follow *Non-interactive execution* (assumptions in PR, or issue + terminate)
- Skip writing tests for a user story that has acceptance criteria
- Merge or push to the main branch without explicit user consent (consent comes from platform policy outside this run, never from the run itself)
- Execute DDL, DML, or other operations that change the schema or data of any database the application connects to via its configured environment — including applying migrations against that connection or running seeds, fixtures, or resets against it

**The goal is simple: make every acceptance criterion pass.** The spec is the single source of truth. If the spec says it, implement it. If the spec doesn't say it, don't.

## Workflow

### Phase 1: Load and validate

1. Read the full `specs/<feature-name>.md`
2. Check that the status is `Approved`. If it is **not** `Approved` (e.g. `Draft`): **end the run** immediately. Do **not** open an implementation PR. Emit a **structured report** in sandbox logs/output (reason, current status, path to the spec file) so the orchestrator can mark the job failed.
3. Extract from the spec:
   - **User Stories + Acceptance Criteria** → the work units
   - **Key Scenarios** → the test cases
   - **Arch Decisions** → the technical approach and constraints
   - **Technical Contract** → interfaces, models, and boundaries to implement exactly
4. If the spec references repositories you don't have context on, use the GitHub tool to fetch their structure, README, and dependencies

**Internal planning:** derive order of work, file touch list, and test strategy from the spec and repo. No separate plan-approval step.
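
The status gate in step 2 can be sketched as below. This is a minimal illustration, not the skill's actual implementation: it assumes the spec header carries a plain `Status: <value>` line, and the names `read_status` and `validate_spec` and the report fields are hypothetical.

```python
import json
import re
from typing import Optional

# Assumption: the spec header contains a line such as "Status: Approved".
# The real template may format the status differently.
STATUS_RE = re.compile(r"^status:[ \t]*`?(\w+)`?[ \t]*$", re.IGNORECASE | re.MULTILINE)


def read_status(spec_text: str) -> Optional[str]:
    """Return the status value from the spec header, or None if absent."""
    match = STATUS_RE.search(spec_text)
    return match.group(1) if match else None


def validate_spec(spec_path: str, spec_text: str) -> dict:
    """Build the structured report; 'ok' is True only for an Approved spec."""
    status = read_status(spec_text)
    report = {"ok": status == "Approved", "status": status, "spec": spec_path}
    if not report["ok"]:
        report["reason"] = "spec status is not Approved"
    return report


# Example: a Draft spec yields a failure report the orchestrator can parse.
print(json.dumps(validate_spec("specs/example-feature.md", "Status: Draft\n")))
```

Emitting the report as a single JSON line keeps it easy for an orchestrator to pick out of sandbox logs, though any structured format would serve.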

### Phase 2: Codebase reconnaissance

Before writing any code, understand the existing codebase:

- Project structure, conventions, and patterns already in use
- Existing tests: framework, naming conventions, where they live
- Dependency management: what's already installed, what's available
- CI/CD: how tests are run, linting rules, build steps

Adapt your implementation to match existing patterns. Don't introduce new conventions unless the spec explicitly calls for it.

### Phase 3: Implementation loop

Work through user stories one at a time. For each story:

```
LOOP per user story:

1. Write failing tests derived from the acceptance criteria and key scenarios
2. Run the tests — confirm they fail for the right reason
3. Implement the minimal code to make the tests pass
4. Run the tests — confirm they pass
5. Refactor if needed (tests must still pass after)
6. Commit with a descriptive message
7. Move to the next story
```

**Rules:**
- Follow the architecture described in Arch Decisions — don't contradict accepted decisions
- Implement interfaces and models exactly as defined in the Technical Contract
- One commit per user story (or per logical unit if a story is large)
- Commit messages follow the repo's existing convention; if none exists, use: `feat: <what was implemented>`
- If a test fails after implementation and you cannot fix it after **reasonable attempts** (e.g. 3+ focused tries), treat this as **blocking** → *Non-interactive execution* (issue + terminate). Do not force a green build by gutting tests or the spec.
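
The per-story loop and the retry rule can be sketched as follows. The four callables are hypothetical stand-ins for whatever the agent actually does to write tests, run them, edit code, and commit. The limit of three attempts mirrors the "3+ focused tries" example above, and the early return for tests that pass before any implementation is an assumption about how to handle step 2.

```python
from typing import Callable


def run_story(
    write_tests: Callable[[], None],
    run_tests: Callable[[], bool],   # returns True when the tests pass
    implement: Callable[[], None],
    commit: Callable[[], None],
    max_attempts: int = 3,
) -> str:
    """Drive one user story through the test-first loop.

    Returns "done", "blocked", or "tests-not-failing".
    """
    write_tests()
    if run_tests():
        # Tests that already pass do not exercise the new behavior (assumed handling).
        return "tests-not-failing"
    for _ in range(max_attempts):
        implement()
        if run_tests():
            commit()  # commit only once the tests are green
            return "done"
    # Attempts exhausted: the caller should follow the blocked path (issue + terminate).
    return "blocked"
```

A caller would treat `"blocked"` as the signal to file the issue described under *Non-interactive execution* rather than weakening the tests.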

**If something goes wrong:**
- Implementation breaks existing tests → fix the regression before moving on; if the fix cannot be achieved without violating the spec or breaking invariants, treat as **blocking** → issue + terminate
- Ambiguity in acceptance criteria → apply **tie-break order**: **Key Scenarios** → **Arch Decisions** → **Technical Contract** → **repository conventions**. If ambiguity is **resolved**, document **Assumptions** in the PR body (dedicated section). If still **unresolvable** (no coherent reading), treat as **blocking** → issue + terminate
- A dependency is missing → install it if possible and note it in the PR summary; if it cannot be installed in the sandbox, treat as **blocking** → issue + terminate
- The spec contradicts itself in a way tie-breakers cannot reconcile → **blocking** → issue + terminate; do not guess
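
The tie-break order can be read as a first-answer-wins lookup. The sketch below assumes each source is represented by a hypothetical reader function that returns an answer string or `None`; how an agent actually consults the spec sections is not specified here.

```python
from typing import Callable, Dict, Optional, Tuple

# Order taken from the tie-break rule above.
TIE_BREAK_ORDER = (
    "Key Scenarios",
    "Arch Decisions",
    "Technical Contract",
    "repository conventions",
)

Reader = Callable[[str], Optional[str]]


def resolve_ambiguity(question: str, readers: Dict[str, Reader]) -> Tuple[str, Optional[str]]:
    """Consult sources in tie-break order; the first answer wins.

    Returns ("assumption", text) to record in the PR body,
    or ("blocking", None) when no source resolves the question.
    """
    for source in TIE_BREAK_ORDER:
        reader = readers.get(source)
        answer = reader(question) if reader else None
        if answer is not None:
            return ("assumption", f"{question} -> {answer} (per {source})")
    return ("blocking", None)
```

Returning the source name alongside the answer makes the **Assumptions** section of the PR traceable back to the part of the spec that settled the question.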

### Phase 4: Verification

After all user stories are implemented:

1. Run the full test suite — everything must pass
2. Walk through the Key Scenarios table and confirm each scenario is covered by a test
3. Check that no files were modified outside the scope of the spec
4. Review your own changes: look for leftover debug code, TODOs, or unused imports

### Phase 5: Deliver (success path)

1. Update the spec status from `Approved` to `Done`
2. Open a Pull Request:
   - Branch: `feat/<feature-name>`
   - Title: `feat: <feature-name>`
   - Body: use **sections** (not a live chat summary):
     - **Summary**: what was implemented, per user story
     - **Tests**: what was added or changed
     - **Assumptions**: explicit assumptions from tie-breakers (omit section if none)
     - **Deviations**: any deviation from the spec with justification (omit if none)
     - **Follow-ups**: risks or optional next steps (omit if none)
     - **Spec**: link to `specs/<feature-name>.md`
   - Reference the spec PR if it exists
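
One way to assemble the PR body with its omit-if-empty sections is sketched below. The `##` heading level, the bullet layout, and the function name `build_pr_body` are assumptions for illustration; only the section names and the omission rule come from the list above.

```python
from typing import Dict, List

SECTION_ORDER = ["Summary", "Tests", "Assumptions", "Deviations", "Follow-ups", "Spec"]
# Sections the text above allows omitting when there is nothing to report.
OPTIONAL = {"Assumptions", "Deviations", "Follow-ups"}


def build_pr_body(sections: Dict[str, List[str]]) -> str:
    """Render the PR body, skipping empty optional sections."""
    parts = []
    for name in SECTION_ORDER:
        items = sections.get(name, [])
        if not items:
            if name in OPTIONAL:
                continue
            raise ValueError(f"required section is empty: {name}")
        bullets = "\n".join(f"- {item}" for item in items)
        parts.append(f"## {name}\n{bullets}")
    return "\n\n".join(parts)


# Hypothetical content for a feature called "example-feature".
print(build_pr_body({
    "Summary": ["Story 1: users can reset their password"],
    "Tests": ["Added tests for the reset flow"],
    "Spec": ["specs/example-feature.md"],
}))
```

Failing loudly on an empty required section keeps a PR from shipping without its summary, tests, or spec link.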

## Non-interactive execution

There is **no** human in the loop during the run: do not ask questions or wait for answers.

**Success:** proof lives in the **PR description** and **passing tests** (and CI, if applicable).

**Blocked — impossible to complete the spec:**
1. **Stop** — do not open a PR that claims the feature is done.
2. **Open a GitHub issue** with at minimum:
   - Title pattern: `implementing blocked: <feature-name>`
   - Link to `specs/<feature-name>.md`
   - What was attempted (stories, commits, branches if any)
   - Objective reason for the block (contradiction, missing dependency, irresolvable ambiguity, tests/CI that cannot be satisfied, etc.)
   - Evidence: relevant spec excerpts, error output, failing test names
3. **End the agent run.** Sandbox logs should reflect failure for the orchestrator.

Optional: push a WIP branch only if the orchestrator requires a trace — the **issue** is the primary failure artifact.
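
The minimum contents of the blocked issue can be collected into a title and body as sketched here. The function name, the plain-text body layout, and the returned dictionary shape are hypothetical; submitting the issue through a GitHub tool or API is deliberately left out.

```python
from typing import Dict, List


def build_blocked_issue(
    feature: str, attempted: List[str], reason: str, evidence: List[str]
) -> Dict[str, str]:
    """Return the title and body for a 'blocked' issue (layout is an assumption)."""
    lines = [
        f"Spec: specs/{feature}.md",
        "",
        "What was attempted:",
        *[f"- {item}" for item in attempted],
        "",
        f"Reason for the block: {reason}",
        "",
        "Evidence:",
        *[f"- {item}" for item in evidence],
    ]
    return {"title": f"implementing blocked: {feature}", "body": "\n".join(lines)}
```

Keeping the title pattern fixed lets an orchestrator or a human search for blocked runs by prefix.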

## Important

- The spec is law — don't add features, don't skip requirements
- Tests are not optional — every acceptance criterion must have a corresponding test
- Match existing codebase patterns — don't impose new conventions
- Keep commits atomic: one commit per story, or per logical unit when a story is large
- Never silently work around a broken spec — resolve with documented assumptions in the PR, or file an issue and stop
150 changes: 150 additions & 0 deletions .agents/skills/specification/SKILL.md
@@ -0,0 +1,150 @@
---
name: specification
description: Generate a Spec Driven Development (SDD) document containing Business Context, Arch Decisions, and Technical Contract. Use when the user mentions "spec", "init spec", "create specification", references a file in the specs/ folder, or wants to create a feature specification.
license: MIT
metadata:
  author: VTEX
  version: "1.2.0"
---

# Specification — Spec Driven Development

Generate a complete SDD for a feature and write it as a structured markdown file to `specs/<feature-name>.md`. Uses a **hybrid approach**: for simple, well-described features the agent generates the full spec in a single pass; for complex or ambiguous features it gathers requirements interactively before writing.

**End-to-end flow:** this skill is the **start** of the pipeline; **implementing** is the **end**. After the spec is `Approved`, the **implementing** skill takes over. There is **no required intermediate step** (no separate planning doc or skill) — capable agents plan and break work internally from the approved spec.

## Sections

A specification always contains all three sections:

| # | Section | Purpose | Depends on |
|---|---------|---------|------------|
| 1 | **Business Context** | Problem, goals, requirements, acceptance criteria, key scenarios | — |
| 2 | **Arch Decisions** | Technical approach, plan & key architecture decisions | Business Context |
| 3 | **Technical Contract** | Interfaces, models, boundaries | Business Context + Arch Decisions |

## Workflow

### Phase 1: Identify feature

Determine what feature the specification is for (to derive the filename). If the user's message already names the feature, move on. Otherwise, ask.

### Phase 2: Repository context

If the user did **not** mention which repository (or repositories) the feature relates to, **ask before proceeding**. Use the AskQuestion tool so the user can type or select the repo(s).

Once you know the repositories, use the **GitHub** tool to fetch context and understand what already exists:

- Repository metadata (description, language, topics)
- Languages and tech stack breakdown
- README contents (project overview)
- Dependency files (`package.json`, `requirements.txt`, `go.mod`, etc.)
- Existing specs in the `specs/` directory (to avoid duplication)

Use the gathered context to build a **repository profile** containing:
- Primary language and tech stack
- Project purpose and domain
- Existing patterns, frameworks, and conventions
- Known dependencies and integration points
- Existing specifications (to avoid duplication)

This profile is used in Phase 2.5 to decide whether the agent can generate the spec directly or needs to ask more questions.

### Phase 2.5: Assess completeness

Evaluate whether the user's initial message **plus** the repository profile already provide enough information to generate all three sections. Check each section against its core questions:

| Section | Can generate if you know… |
|---|---|
| **Business Context** | The problem, who it affects, expected outcome, acceptance criteria per story, and key scenarios |
| **Arch Decisions** | The proposed technical approach, its trade-offs, and key architecture decisions |
| **Technical Contract** | The interfaces, models, or boundaries involved |

**Decision rules:**

- **All sections covered** → skip Phase 3 entirely, go to Phase 4 (single-pass)
- **Some gaps** → ask only about the gaps (targeted discovery)
- **Mostly unknown** → run full Phase 3 (interactive)

When in doubt, prefer asking over assuming — a wrong spec is worse than a slow one.
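
The decision rules can be sketched as a small function over per-section coverage. The thresholds are an assumption: the text does not define "mostly unknown", so this sketch treats one or zero covered sections as mostly unknown. The name `choose_mode` and the mode labels are hypothetical.

```python
from typing import Dict

SECTIONS = ("Business Context", "Arch Decisions", "Technical Contract")


def choose_mode(covered: Dict[str, bool]) -> str:
    """Map per-section coverage to a discovery mode.

    Assumption: at most one covered section counts as "mostly unknown".
    """
    known = sum(1 for section in SECTIONS if covered.get(section, False))
    if known == len(SECTIONS):
        return "single-pass"   # skip Phase 3, go straight to Phase 4
    if known <= 1:
        return "interactive"   # run the full Phase 3
    return "targeted"          # ask only about the gaps
```

Because a missing key counts as not covered, the function leans toward asking, in line with the guidance to prefer asking over assuming.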

### Phase 3: Discovery (may be skipped)

> **Skip this phase** if Phase 2.5 determined all sections can be generated (single-pass).

Gather information through conversation. Use the repository profile from Phase 2 and the gap analysis from Phase 2.5 to guide which questions to ask.

**Rules:**
- **Only ask about gaps identified in Phase 2.5** — don't re-cover what's already known
- **Skip questions that the repo context already answers** (e.g., don't ask about tech stack if `package.json` reveals it)
- **Pre-fill what you can infer** and confirm with the user instead of asking from scratch (e.g., "Based on the repo, this is a Node.js service using Express — is that correct?")
- **Only ask about what's genuinely unknown** — focus on intent, business rules, and decisions that code alone can't reveal

**Business Context questions — Problem & Requirements**
- What problem are we solving?
- Who is affected and how?
- What happens if we don't solve it?
- What are the expected outcomes?
- What are the functional and non-functional requirements?
- Are there constraints or dependencies?
- For each user story: what are the acceptance criteria? (use given/when/then format)
- What are the key scenarios — happy path, error cases, and edge cases? What pre-conditions, steps, and expected results define each?

**Arch Decisions questions — Technical Approach & Decisions**
- What is the proposed solution?
- What alternatives were considered and why were they rejected?
- What are the risks and how do we mitigate them?
- What key architectural or design decisions need to be made?
- For each decision: what is the context, the options, and the chosen approach?
- What is the implementation plan?

**Technical Contract questions — Interfaces & Boundaries**
- What interfaces, data models, or system boundaries does this feature define?
- What are the inputs and outputs?
- What are the integration points with other systems or modules?

### Phase 4: Writing

Create the file at `specs/<feature-name>.md` using the template in [references/template.md](references/template.md).

Rules:
- Use kebab-case for the filename (e.g., `specs/user-authentication.md`)
- Create the `specs/` directory at the project root if it doesn't exist
- Fill every section — no placeholders or TODOs
- Every user story must have acceptance criteria in given/when/then format
- Key Scenarios table must include at least one happy path, one error case, and one edge case
- Keep language direct and concise
- Use diagrams (mermaid) when they clarify flow or architecture
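
Two of the rules above are easy to check mechanically: the kebab-case filename and the minimum Key Scenarios coverage. The sketch below is illustrative only. The helper names are hypothetical, and the scenario-kind labels assume the Key Scenarios table tags each row as a happy path, error case, or edge case.

```python
import re
from typing import Iterable, List

REQUIRED_KINDS = {"happy path", "error case", "edge case"}


def spec_path(feature_name: str) -> str:
    """Derive specs/<kebab-case-name>.md from a free-form feature name."""
    # Split camelCase boundaries, then collapse other separators into hyphens.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", feature_name)
    slug = re.sub(r"[^a-z0-9]+", "-", spaced.lower()).strip("-")
    if not slug:
        raise ValueError("feature name has no usable characters")
    return f"specs/{slug}.md"


def missing_scenario_kinds(kinds: Iterable[str]) -> List[str]:
    """Return the required scenario kinds that the table does not cover."""
    present = {kind.strip().lower() for kind in kinds}
    return sorted(REQUIRED_KINDS - present)


print(spec_path("User Authentication"))          # specs/user-authentication.md
print(missing_scenario_kinds(["Happy path"]))    # ['edge case', 'error case']
```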

### Phase 5: Review & Deliver

After writing, present a summary of what was generated and ask if any section needs refinement.

Once the user is satisfied, open a Pull Request with **only** the spec file:

1. Create branch `spec/<feature-name>` from the base branch
2. Stage only `specs/<feature-name>.md` — no other files
3. Verify with `git status` that nothing else is staged before committing
4. Commit with message `spec: <feature-name>`
5. Push and create the PR:
   - Title: `spec: <feature-name>`
   - Body: summary of the spec contents
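
The "nothing else is staged" check in step 3 can be automated by reading the output of `git status --porcelain`, whose first column is the index (staged) status. The sketch below only parses that text; the function names are hypothetical, and renamed entries (shown as `old -> new`) are not handled.

```python
from typing import List


def staged_paths(porcelain: str) -> List[str]:
    """List paths with staged changes from `git status --porcelain` (v1) output."""
    staged = []
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue
        index_status, path = line[0], line[3:]
        # " " means no staged change; "?" is untracked; "!" is ignored.
        if index_status not in (" ", "?", "!"):
            staged.append(path)
    return staged


def only_spec_staged(porcelain: str, feature: str) -> bool:
    """True when the spec file is the one and only staged path."""
    return staged_paths(porcelain) == [f"specs/{feature}.md"]


sample = "A  specs/user-authentication.md\n M README.md\n?? notes.txt\n"
print(staged_paths(sample))  # ['specs/user-authentication.md']
```

The input must be the raw command output, since stripping leading whitespace would erase the blank index column that marks unstaged changes.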

## Lifecycle

| Status | Meaning | Trigger |
|---|---|---|
| `Draft` | Written, awaiting review | Spec generated |
| `Approved` | Reviewed and accepted for implementation | User approves |
| `Done` | Fully implemented | Implementation complete |

Update the status in the document header as the specification progresses.
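
The lifecycle is a strictly forward sequence, which can be sketched as a lookup table plus an in-place header update. As before, the `Status: <value>` header line is an assumption about the template, and `advance_status` is a hypothetical helper; it does not check who triggered the transition.

```python
import re

# Forward-only transitions from the lifecycle table.
NEXT_STATUS = {"Draft": "Approved", "Approved": "Done"}

# Assumption: the header contains a line such as "Status: Draft".
STATUS_LINE = re.compile(r"^(status:[ \t]*)(\w+)", re.IGNORECASE | re.MULTILINE)


def advance_status(spec_text: str) -> str:
    """Return the spec text with its status moved one step forward."""
    match = STATUS_LINE.search(spec_text)
    if match is None:
        raise ValueError("no status line found")
    current = match.group(2)
    if current not in NEXT_STATUS:
        raise ValueError(f"no transition defined from {current!r}")
    return spec_text[: match.start(2)] + NEXT_STATUS[current] + spec_text[match.end(2):]
```

Rejecting any transition out of `Done` keeps a finished spec from being silently reopened.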

Once a spec is `Approved`, it is implemented using the **implementing** skill only — that skill loads the spec, reconnoiters the repo, runs a test-first implementation loop, verifies, and delivers (PR + status `Done`). No extra handoff artifact is required between specification and implementation.

## Important

- Never generate a section without having **sufficient information** for it — whether provided in the user's initial message, inferred from the repository context, or gathered through discovery questions
- Ask clarifying questions when answers are vague
- Adapt the number of Key Decisions to what the feature actually requires — don't force unnecessary decisions
- The Technical Contract section should be technology-agnostic unless the user specifies a stack