1 change: 1 addition & 0 deletions docs/decisions/README.md
@@ -6,6 +6,7 @@ This directory contains architectural decision records for the Torrust Tracker D

| Status | Date | Decision | Summary |
| ------------- | ---------- | --------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------ |
| ✅ Accepted | 2026-02-24 | [Docker Compose Local Validation Placement](./docker-compose-local-validation-placement.md) | Validate rendered docker-compose.yml in the infrastructure generator, not the app layer |
| ✅ Accepted | 2026-02-17 | [Application-Layer Progress Reporting Trait](./application-layer-progress-reporting-trait.md) | Use trait-based progress reporting with Dependency Inversion for testable, reusable design |
| ✅ Accepted | 2026-02-13 | [Concurrent Docker Image Builds in Tests](./concurrent-docker-image-builds-in-tests.md) | Treat "already exists" tagging error as success when parallel tests build same image |
| ✅ Accepted | 2026-02-06 | [Agent Skills Content Strategy](./skill-content-strategy-duplication-vs-linking.md) | Three-tier content strategy: self-contained workflows, progressive disclosure, linked docs |
155 changes: 155 additions & 0 deletions docs/decisions/docker-compose-local-validation-placement.md
@@ -0,0 +1,155 @@
# Decision: Docker Compose Local Validation Placement in the Infrastructure Layer

## Status

Accepted

## Date

2026-02-24

## Context

After rendering the `docker-compose.yml.tera` Tera template, the resulting
`docker-compose.yml` file may contain structural errors that Docker Compose will
reject at `run` time with a non-descriptive failure deep in the deployment
workflow. Issue [#382](../issues/382-docker-compose-template-empty-networks-key.md)
was a concrete example: an empty `networks:` key rendered when no optional
services were configured, which caused `docker compose up` to fail with
`services.tracker.networks must be a list` on the remote VM — far from the
point of origin.

To catch such errors early and provide actionable feedback, a validation step was
added that runs `docker compose config --quiet` against the rendered output
directory immediately after template rendering, before any files are uploaded to
the VM.

### The Question

The rendering pipeline has multiple candidate locations for the validation call:

```text
RenderDockerComposeTemplatesStep::execute() ← application/steps layer
└─ DockerComposeTemplateRenderingService::render() ← application service layer
└─ DockerComposeProjectGenerator::render() ← infrastructure layer ← chosen
├─ EnvRenderer::render() ← writes .env
└─ DockerComposeRenderer::render() ← writes docker-compose.yml
```

Alternatives explicitly considered:

1. **`DockerComposeRenderer::render()`** — the lowest-level renderer, inside
infrastructure, writes only `docker-compose.yml`
2. **`DockerComposeProjectGenerator::render()`** — the infrastructure-layer
generator, writes both `docker-compose.yml` and `.env` into the build
directory
3. **`DockerComposeTemplateRenderingService::render()`** — the application-layer
service that orchestrates the generator
4. **`RenderDockerComposeTemplatesStep::execute()`** — the application-layer step
invoked by the command handler

### Constraints

1. **DDD layering**: Infrastructure operations (spawning a child process, invoking a
CLI tool) must not leak upward into the application layer. The application layer
should remain free of concrete system-call dependencies.
2. **Complete artifact**: `docker compose config` reads variable substitutions from
the `.env` file in the same directory. Validating before `.env` exists produces
false failures or incomplete results.
3. **Single responsibility alignment**: `DockerComposeRenderer::render()` is
responsible for template rendering only; embedding a shell-command invocation
there mixes two distinct concerns inside an already-focused collaborator.

## Decision

Call `validate_docker_compose_file()` inside
**`DockerComposeProjectGenerator::render()`**, immediately after both
`EnvRenderer` and `DockerComposeRenderer` have written their output files.

```rust
// project_generator.rs — after both render() calls succeed:
validate_docker_compose_file(&build_compose_dir)
.map_err(|source| DockerComposeProjectGeneratorError::DockerComposeValidationFailed { source })?;
```

This is the earliest point in the pipeline where:

- the **complete artifact** (both `docker-compose.yml` and `.env`) is available,
making the validation meaningful and accurate, and
- the code is **still inside the infrastructure layer**, respecting the DDD
boundary that keeps process-spawning concerns out of the application layer.
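A minimal sketch of what `validate_docker_compose_file()` plausibly looks like (the real implementation lives in `local_validator.rs`; the error shape here is an assumption, simplified from the typed variants described below). It spawns `docker compose config --quiet` in the rendered directory and surfaces the captured stderr on failure:

```rust
use std::path::Path;
use std::process::Command;

/// Error raised when the rendered compose file fails local validation.
/// The captured diagnostics are surfaced to the user as an actionable message.
#[derive(Debug)]
pub struct DockerComposeLocalValidationError {
    pub details: String,
}

/// Run `docker compose config --quiet` in the rendered output directory.
/// Compose reads both `docker-compose.yml` and `.env` from `dir`, which is
/// why this must run only after the complete artifact has been written.
pub fn validate_docker_compose_file(dir: &Path) -> Result<(), DockerComposeLocalValidationError> {
    let output = Command::new("docker")
        .args(["compose", "config", "--quiet"])
        .current_dir(dir)
        .output()
        .map_err(|e| DockerComposeLocalValidationError { details: e.to_string() })?;

    if output.status.success() {
        Ok(())
    } else {
        Err(DockerComposeLocalValidationError {
            details: String::from_utf8_lossy(&output.stderr).into_owned(),
        })
    }
}
```

Because the function shells out rather than parsing YAML itself, it inherits exactly the validation rules Docker Compose applies at `run` time, which is the point of the check.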

### Why not `DockerComposeRenderer::render()`?

`DockerComposeRenderer` writes only `docker-compose.yml`. At that point, `.env`
does not exist yet. `docker compose config` depends on `.env` for variable
substitution, so validation would be incomplete or outright wrong for
configurations that use `.env` values.

Validating inside a renderer whose responsibility is purely template-to-file
conversion also violates single responsibility.

### Why not the application service or step?

`DockerComposeTemplateRenderingService` and
`RenderDockerComposeTemplatesStep` live in the application layer. Embedding a
call to `validate_docker_compose_file()` — which spawns a `docker` child process
— there would:

- introduce an OS-level, infrastructure dependency into the application layer,
violating DDD layering rules,
- make the application layer harder to test in isolation without a real `docker`
binary present.

The application layer should orchestrate business flows; it should not know how
to invoke `docker compose`.
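The layering argument can be sketched with illustrative stubs (type names match the ADR; the bodies and `String` error type are simplifications, not the project's real signatures):

```rust
/// Infrastructure layer: renders `.env` and `docker-compose.yml`, then
/// validates the complete artifact by spawning `docker compose` itself.
pub struct DockerComposeProjectGenerator;

impl DockerComposeProjectGenerator {
    pub fn render(&self, _build_dir: &str) -> Result<(), String> {
        // 1. EnvRenderer writes `.env`
        // 2. DockerComposeRenderer writes `docker-compose.yml`
        // 3. validate_docker_compose_file(build_dir) spawns the `docker` CLI
        Ok(())
    }
}

/// Application layer: pure orchestration. No `Command::new("docker")` here,
/// so this service stays testable without a `docker` binary installed.
pub struct DockerComposeTemplateRenderingService {
    pub generator: DockerComposeProjectGenerator,
}

impl DockerComposeTemplateRenderingService {
    pub fn render(&self, build_dir: &str) -> Result<(), String> {
        self.generator.render(build_dir)
    }
}
```

In a test, the application service can be exercised with a stub generator, while the process-spawning concern stays behind the infrastructure boundary.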

## Consequences

### Positive

- Validation runs at the earliest correct moment: once and only once, as soon as
the full artifact exists.
- The infrastructure layer encapsulates all `docker` CLI concerns; the application
layer remains free of process-spawning logic.
- `DockerComposeProjectGeneratorError` gains a typed `DockerComposeValidationFailed`
variant, surfacing the exact `docker compose config` output as an actionable
error message to the user at `configure` time.
- Future template changes that introduce structural errors are caught immediately
during development without needing a full deployment cycle.

### Negative / Trade-offs

- `DockerComposeProjectGenerator::render()` is now responsible for more than
pure rendering — it also validates. This is a deliberate widening of scope:
the generator is the "assembly" step that produces the complete artifact, and
post-assembly validation is a natural responsibility for the assembler.
- Validation requires `docker` to be installed on the machine running the
deployer. This is already a project dependency (Docker is listed as a required
tool), so it adds no new requirement but does make the dependency explicit in
the infrastructure code.

## Alternatives Considered

| Location | Rejected Reason |
| ------------------------------------------------- | ------------------------------------------------------------------------ |
| `DockerComposeRenderer::render()` | Artifact incomplete (no `.env`); single-responsibility violation |
| `DockerComposeTemplateRenderingService::render()` | Application layer calling infrastructure (process-spawning) violates DDD |
| `RenderDockerComposeTemplatesStep::execute()` | Same DDD violation; too high up the call stack for an infra concern |

## Related Decisions

- [Tera Minimal Templating Strategy](./tera-minimal-templating-strategy.md) —
philosophy behind how templates are structured and what they are responsible for
- [Actionable Error Messages](./actionable-error-messages.md) —
pattern used by `DockerComposeLocalValidationError::help()` to surface
clear, solution-oriented errors

## References

- Issue [#382](../issues/382-docker-compose-template-empty-networks-key.md) —
the concrete bug that motivated this validation step
- `src/infrastructure/templating/docker_compose/local_validator.rs` —
implementation of `validate_docker_compose_file()`
- `src/infrastructure/templating/docker_compose/template/renderer/project_generator.rs` —
call site within `DockerComposeProjectGenerator::render()`
89 changes: 50 additions & 39 deletions docs/issues/382-docker-compose-template-empty-networks-key.md
@@ -21,10 +21,10 @@ that catches this invalid template output.

## Goals

- - [ ] Fix the template to conditionally render the `networks:` key only when networks exist
- - [ ] Ensure consistency across all service blocks in the template
- - [ ] Add a unit test that renders the template with a minimal config (no optional services) and validates the output
- - [ ] Validate rendered docker-compose.yml with `docker compose config --quiet` after template rendering to fail early
+ - [x] Fix the template to conditionally render the `networks:` key only when networks exist
+ - [x] Ensure consistency across all service blocks in the template
+ - [x] Add a unit test that renders the template with a minimal config (no optional services) and validates the output
+ - [x] Validate rendered docker-compose.yml with `docker compose config --quiet` after template rendering to fail early

## Root Cause Analysis

@@ -94,14 +94,14 @@ Result: `tracker.networks` is an empty `Vec`, and the template renders an invali
The tracker is the **only service that is always present AND can have zero
networks**:

- | Service    | Always Present?   | Can Have Zero Networks?  | Affected?    |
- | ---------- | ----------------- | ------------------------ | ------------ |
- | Tracker    | Yes               | Yes                      | **Yes**      |
- | Caddy      | No (needs domain) | No (always has proxy)    | No           |
- | Prometheus | No (needs domain) | No (always has metrics)  | No           |
- | Grafana    | No (needs domain) | No (always has metrics)  | No           |
- | MySQL      | No (needs mysql)  | No (always has database) | No           |
- | Backup     | Yes               | Yes                      | No (guarded) |
+ | Service    | Always Present?   | Can Have Zero Networks?  | Affected?       |
+ | ---------- | ----------------- | ------------------------ | --------------- |
+ | Tracker    | Yes               | Yes                      | Fixed (guarded) |
+ | Caddy      | No (needs domain) | No (always has proxy)    | Fixed (guarded) |
+ | Prometheus | No (needs domain) | No (always has metrics)  | Fixed (guarded) |
+ | Grafana    | No (needs domain) | No (always has metrics)  | Fixed (guarded) |
+ | MySQL      | No (needs mysql)  | No (always has database) | Fixed (guarded) |
+ | Backup     | Yes               | Yes                      | No (guarded)    |

The **backup** service correctly handles this case with a guard (line 238):

@@ -200,9 +200,9 @@ validated until Docker Compose processes it.

### Phase 1: Fix the Template

- - [ ] Add a conditional guard around the tracker's `networks:` block in
-   `templates/docker-compose/docker-compose.yml.tera` (lines 103-106), matching
-   the pattern already used by the backup service:
+ - [x] Add a conditional guard around the tracker's `networks:` block in
+   `templates/docker-compose/docker-compose.yml.tera`, matching the pattern
+   already used by the backup service:

```tera
{%- if tracker.networks | length > 0 %}
@@ -213,17 +213,19 @@ validated until Docker Compose processes it.
{%- endif %}
```

- - [ ] Audit all other service blocks in the template for the same pattern.
-   Currently, `caddy` (line 80) also renders `networks:` unconditionally, but is
-   safe because caddy always has at least the proxy network when enabled. Add a
-   guard anyway for defensive coding.
+ - [x] Audit all other service blocks in the template for the same pattern.
+   Guards added for all five service blocks: `tracker`, `caddy`, `prometheus`,
+   `grafana`, and `mysql`.

### Phase 2: Add Test Coverage

- - [ ] Add a unit test that renders the docker-compose template with a minimal
-   config (SQLite, no domains, no Prometheus) and verifies the output is valid YAML
- - [ ] Add a unit test that verifies the rendered output does not contain empty
-   `networks:` keys for any service
+ - [x] Add a unit test that renders the docker-compose template with a minimal
+   config (SQLite, no domains, no Prometheus) and verifies the output does not
+   contain an empty `networks:` key
+   (`it_should_not_render_empty_networks_key_for_tracker_when_no_optional_services_are_configured`)
+ - [x] Add a unit test that verifies the rendered output contains the correct
+   `networks:` key when Prometheus is enabled
+   (`it_should_render_networks_key_for_tracker_when_prometheus_is_enabled`)
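The core assertion such tests need can be sketched as a small helper (hypothetical, not the project's actual test code): a `networks:` key is empty when the next non-blank line is not indented deeper, i.e. no list items follow it.

```rust
/// Returns true if the rendered YAML contains a `networks:` key with no
/// entries under it -- the exact shape Docker Compose rejects with
/// `services.tracker.networks must be a list`.
fn has_empty_networks_key(yaml: &str) -> bool {
    let lines: Vec<&str> = yaml.lines().collect();
    for (i, line) in lines.iter().enumerate() {
        if line.trim() == "networks:" {
            let indent = line.len() - line.trim_start().len();
            match lines[i + 1..].iter().find(|l| !l.trim().is_empty()) {
                // `networks:` is the last key in the file: nothing follows it.
                None => return true,
                Some(next) => {
                    let next_indent = next.len() - next.trim_start().len();
                    if next_indent <= indent {
                        // A sibling or parent key follows: the block is empty.
                        return true;
                    }
                }
            }
        }
    }
    false
}
```

A test then renders the minimal config and asserts `!has_empty_networks_key(&rendered)`.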

### Phase 3: Validate Rendered docker-compose.yml After Rendering

Expand All @@ -248,14 +250,19 @@ descriptive error on failure.

**Implementation approach:**

- - [ ] After template rendering in the `configure` step (where docker-compose.yml
+ - [x] After template rendering in the `configure` step (where docker-compose.yml
        is generated), run `docker compose config --quiet` on the output file
+       (`validate_docker_compose_file()` in `local_validator.rs`, called from
+       `DockerComposeProjectGenerator::render()`)
- - [ ] If validation fails, report a clear error to the user indicating the
+ - [x] If validation fails, report a clear error to the user indicating the
        rendered template is invalid, including the docker-compose error output
+       (`DockerComposeValidationFailed` error variant with `help()` message)
- - [ ] This validation should run locally against the `build/<env>/docker-compose/`
+ - [x] This validation should run locally against the `build/<env>/docker-compose/`
        directory before files are uploaded to the VM
- - [ ] Requires `docker` to be available on the machine running the deployer (it
+ - [x] Requires `docker` to be available on the machine running the deployer (it
        already is, since Docker is a project dependency)
+ - [x] Add a unit/integration test covering the validation step
+       (4 tests in `src/infrastructure/templating/docker_compose/local_validator.rs`)

**Benefits:**

@@ -266,36 +273,40 @@ descriptive error on failure.

### Phase 4: Documentation

- - [ ] Update the template documentation comments if needed
+ - [x] Update the template documentation comments if needed (no changes required;
+   guards are self-documenting)

## Acceptance Criteria

**Quality Checks**:

- - [ ] Pre-commit checks pass: `./scripts/pre-commit.sh`
+ - [x] Pre-commit checks pass: `./scripts/pre-commit.sh`

**Task-Specific Criteria**:

- - [ ] Minimal config (SQLite, no domains, no Prometheus) produces valid docker-compose.yml
- - [ ] The `run` command succeeds with the configuration that previously failed
- - [ ] Template guards are consistent across all service blocks
- - [ ] Unit test covers the minimal-config rendering scenario
- - [ ] Rendered docker-compose.yml is validated with `docker compose config --quiet` after template rendering
- - [ ] Invalid templates produce a clear, actionable error at `configure` time (not `run` time)
+ - [x] Minimal config (SQLite, no domains, no Prometheus) produces valid docker-compose.yml
+ - [x] Template guards are consistent across all service blocks
+ - [x] Unit test covers the minimal-config rendering scenario
+ - [x] Rendered docker-compose.yml is validated with `docker compose config --quiet` after template rendering
+ - [x] Invalid templates produce a clear, actionable error at `configure` time (not `run` time)
+ - [x] The `run` command succeeds with the configuration that previously failed
+   - Verified with `envs/minimal-fix-test.json` (SQLite, no domains, no Prometheus):
+     `create → provision → configure → release → run → test` all passed ✅

## Related Documentation

- Template source: `templates/docker-compose/docker-compose.yml.tera`
- Network derivation: `src/domain/tracker/config/mod.rs` (lines 560-585)
- Network topology: `src/domain/topology/network.rs`
- Template context builder: `src/infrastructure/templating/docker_compose/template/wrappers/docker_compose/context/builder.rs`
- Local validator: `src/infrastructure/templating/docker_compose/local_validator.rs`
- ADR: [Docker Compose Local Validation Placement](../decisions/docker-compose-local-validation-placement.md)

## Notes

- - The `caddy` service block (line 80) also renders `networks:` unconditionally but
-   is currently safe because caddy is only included when domains are configured, and
-   caddy always has the proxy network. Adding a guard there too would be defensive
-   but recommended for consistency.
+ - All service `networks:` blocks (`tracker`, `caddy`, `prometheus`, `grafana`,
+   `mysql`) are now guarded with `{%- if <service>.networks | length > 0 %}`,
+   matching the pre-existing `backup` guard. The template is now consistent.
- This bug affects any "minimal" deployment configuration without optional services.
As the project adds more minimal deployment examples, this will become a more
common failure mode.