feat: Achieve 100% Ring Multi-Tenant Standards Compliance #94

@jeffersonrodrigues92

Summary

Matcher currently meets 74% (20/27) of Ring multi-tenant standards. This issue tracks the implementation of the remaining 7 gaps to reach full compliance.

Audit report: docs/plans/2026-04-06-multi-tenant-100-compliance.md
Visual audit: ~/.agent/diagrams/matcher-multi-tenant-audit.html


Current State (20/27 Compliant)

All core isolation layers are implemented and working:

  • Tenant ID from JWT only (TenantExtractor middleware)
  • Schema-per-tenant via SET LOCAL search_path (ApplyTenantSchema)
  • Per-tenant connection pooling (TenantConnectionManager with singleflight + lease-based lifecycle)
  • Redis key scoping (ScopedRedisSegments)
  • Object storage path scoping (ScopedObjectStorageKey)
  • Tenant-scoped idempotency and rate limiting
  • Read replica tenant isolation with resetSearchPath safety
  • Systemplane runtime config (11 tenancy keys)
  • forbidigo + depguard linter enforcement
  • Integration test for cross-tenant isolation (TestCrossTenantIsolation_H03)

Gaps to Close (7 items)

Gap 1 — Per-Tenant Schema Migration Provisioning [HIGH]

Ring standard: Automated provisioning of tenant schemas when onboarded.
Current state: Only shared-schema migrations in internal/bootstrap/migrations.go. Integration tests manually create schemas via CREATE SCHEMA.
What to implement: PostgresTenantSchemaProvisioner that creates per-tenant schemas and runs all embedded migrations within them. Must be idempotent.

Agent execution steps (12 tasks, ~45 min)

Skill: /ring:dev-cycle with ring:backend-engineer-golang

  1. Define port interface — Add TenantSchemaProvisioner interface to internal/shared/ports/infrastructure.go:

    type TenantSchemaProvisioner interface {
        ProvisionTenantSchema(ctx context.Context, tenantID string) error
    }
  2. Write failing unit tests — Create internal/shared/infrastructure/tenant/adapters/schema_provisioner_test.go with tests for:

    • NewPostgresTenantSchemaProvisioner constructor validation (nil DSN, empty DB name, valid params)
    • Build tag: //go:build unit
  3. Implement adapter — Create internal/shared/infrastructure/tenant/adapters/schema_provisioner.go:

    • Constructor validates DSN and DB name
    • ProvisionTenantSchema() flow:
      1. Validate tenant ID is UUID via libCommons.IsUUID()
      2. Open DB connection to primary DSN
      3. CREATE SCHEMA IF NOT EXISTS using auth.QuoteIdentifier() for SQL injection prevention
      4. SET search_path TO "<tenantID>", public
      5. Create postgres.WithInstance() driver with SchemaName: tenantID
      6. Create iofs.New(migrations.FS, ".") source from embedded migrations
      7. Run migrator.Up() — handle migrate.ErrNoChange as success
    • Add OpenTelemetry span: infrastructure.tenant.provision_schema
    • Compile-time interface assertion: var _ ports.TenantSchemaProvisioner = (*PostgresTenantSchemaProvisioner)(nil)
  4. Write integration tests — Create tests/integration/tenant/schema_provisioner_test.go:

    • TestProvisionTenantSchema_CreatesSchemaAndTables: verify schema exists + tables created
    • TestProvisionTenantSchema_Idempotent: call twice, no error
    • TestProvisionTenantSchema_InvalidTenantID_ReturnsError: non-UUID rejected
    • Build tag: //go:build integration
  5. Wire in bootstrap — Modify internal/bootstrap/dynamic_infrastructure_provider.go:

    • Add schemaProvisioner field to dynamicInfrastructureProvider
    • Create provisioner from Config.Postgres DSN in buildTenantConnectionManagerFromConfig
  6. Verify: go test -tags unit ./internal/shared/infrastructure/tenant/adapters/... + make lint
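The validation and quoting steps of ProvisionTenantSchema() can be sketched without a database. This is a minimal illustration, not the adapter itself: uuidPattern and quoteIdentifier are hypothetical stand-ins for libCommons.IsUUID() and auth.QuoteIdentifier(), whose exact behavior is assumed here, and provisionStatements is an invented helper that only builds the two statements issued before the migrator runs.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// uuidPattern is a hypothetical stand-in for libCommons.IsUUID()
// (assumed to accept the canonical 8-4-4-4-12 hex form).
var uuidPattern = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

var errInvalidTenantID = errors.New("tenant ID must be a UUID")

// quoteIdentifier is a hypothetical stand-in for auth.QuoteIdentifier():
// it wraps the name in double quotes and doubles any embedded quote.
func quoteIdentifier(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// provisionStatements returns the statements to run before migrations.
// Both are safe to repeat, which keeps provisioning idempotent.
func provisionStatements(tenantID string) ([]string, error) {
	if !uuidPattern.MatchString(tenantID) {
		return nil, fmt.Errorf("%w: %q", errInvalidTenantID, tenantID)
	}

	schema := quoteIdentifier(tenantID)

	return []string{
		"CREATE SCHEMA IF NOT EXISTS " + schema,
		"SET search_path TO " + schema + ", public",
	}, nil
}

func main() {
	stmts, err := provisionStatements("3f2b8c1e-9d4a-4e6b-8f1a-2c3d4e5f6a7b")
	if err != nil {
		panic(err)
	}

	for _, stmt := range stmts {
		fmt.Println(stmt)
	}
}
```

Rejecting non-UUID input before quoting gives two layers of defense: the quoting alone would neutralize an embedded quote, but the UUID check keeps arbitrary schema names from being created at all.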


Gap 2 — Event-Driven Tenant Discovery [MEDIUM]

Ring standard: Redis Pub/Sub with EventListener, TenantCache, TenantLoader.
Current state: Tenant config fetched synchronously via HTTP (RemoteConfigurationAdapter). No proactive discovery.
What to implement: RedisTenantEventListener that subscribes to tenant lifecycle events and triggers cache invalidation + schema provisioning.

Agent execution steps (10 tasks, ~35 min)

Skill: /ring:dev-cycle with ring:backend-engineer-golang
Depends on: Gap 1 (provisioner) + Gap 3 (cache)

  1. Define port types — Add to internal/shared/ports/infrastructure.go:

    type TenantEvent struct {
        Type     string `json:"type"`     // "tenant.created", "tenant.updated", "tenant.deleted"
        TenantID string `json:"tenantId"`
    }
    
    type TenantEventListener interface {
        Start(ctx context.Context) error
        Stop() error
    }
  2. Write failing unit tests — Create internal/shared/infrastructure/tenant/adapters/redis_event_listener_test.go:

    • Constructor validation (nil callback, empty channel, valid config)
  3. Implement listener — Create internal/shared/infrastructure/tenant/adapters/redis_event_listener.go:

    • RedisTenantEventListenerConfig with Channel + OnEvent callback
    • Start() subscribes to Redis Pub/Sub channel, loops on messages
    • handleMessage() deserializes JSON → TenantEvent, calls OnEvent
    • Panic recovery via runtime.RecoverAndLogWithContext
    • Stop() via sync.Once + stop channel
    • atomic.Bool prevents double-start
  4. Wire in bootstrap — Modify internal/bootstrap/dynamic_infrastructure_provider.go:

    • startTenantEventListener() creates listener with callback:
      • tenant.created → cache.Invalidate(tenantID) + provisioner.ProvisionTenantSchema(tenantID)
      • tenant.updated → cache.Invalidate(tenantID)
      • tenant.deleted → cache.Invalidate(tenantID)
    • Launch via runtime.SafeGoWithContextAndComponent
  5. Add config — Add MultiTenantEventChannel to TenancyConfig in internal/bootstrap/config.go:

    MultiTenantEventChannel string `env:"MULTI_TENANT_EVENT_CHANNEL" envDefault:"matcher.tenant.events"`

    Empty value disables event-driven discovery (graceful degradation).

  6. Verify: go test -tags unit ./internal/shared/infrastructure/tenant/adapters/... + make lint
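The handleMessage() and callback behavior described in steps 3 and 4 might look roughly like the sketch below. It leaves out the Redis subscription entirely and works on a raw payload. The hooks struct is hypothetical and stands in for the cache and provisioner wired in bootstrap; treating an empty tenantId or an unrecognized event type as an error is an assumption, not something the plan specifies.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// TenantEvent mirrors the port type defined in step 1.
type TenantEvent struct {
	Type     string `json:"type"`
	TenantID string `json:"tenantId"`
}

// hooks is a hypothetical bundle of the two side effects the callback triggers.
type hooks struct {
	invalidate func(tenantID string)
	provision  func(tenantID string) error
}

var errUnknownEvent = errors.New("unknown tenant event type")

// handleMessage decodes one Pub/Sub payload and dispatches it by event type.
func handleMessage(payload []byte, h hooks) error {
	var ev TenantEvent
	if err := json.Unmarshal(payload, &ev); err != nil {
		return fmt.Errorf("decode tenant event: %w", err)
	}

	if ev.TenantID == "" {
		return errors.New("tenant event is missing tenantId") // assumed policy
	}

	switch ev.Type {
	case "tenant.created":
		h.invalidate(ev.TenantID)
		return h.provision(ev.TenantID)
	case "tenant.updated", "tenant.deleted":
		h.invalidate(ev.TenantID)
		return nil
	default:
		return fmt.Errorf("%w: %q", errUnknownEvent, ev.Type) // assumed policy
	}
}

func main() {
	h := hooks{
		invalidate: func(id string) { fmt.Println("invalidate", id) },
		provision:  func(id string) error { fmt.Println("provision", id); return nil },
	}

	err := handleMessage([]byte(`{"type":"tenant.created","tenantId":"t1"}`), h)
	fmt.Println("error:", err)
}
```

Keeping the dispatch separate from the subscription loop means it can be unit tested and fuzzed with plain byte slices, without a Redis instance.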


Gap 3 — Tenant Config Caching (TTL) [MEDIUM]

Ring standard: TTL-based cache layer for GetTenantConfig results.
Current state: RemoteConfigurationAdapter makes HTTP call per tenant. Connection pools cached but config itself is not.
What to implement: CachedConfigurationAdapter decorator with a TTL and a stale-on-error fallback (an expired entry is served only when the delegate call fails).

Agent execution steps (8 tasks, ~25 min)

Skill: /ring:dev-cycle with ring:backend-engineer-golang

  1. Write failing unit tests — Create internal/shared/infrastructure/tenant/adapters/cached_configuration_test.go:

    • Constructor validation (nil delegate, zero TTL, valid params)
    • Cache hit test (2nd call returns cached, delegate called once)
    • TTL expiry test (sleep past TTL, delegate called again)
    • Stale-on-error test (delegate fails after cache populated → return stale entry)
    • Invalidate() test (removes entry, next call hits delegate)
    • InvalidateAll() test (clears everything)
  2. Implement decorator — Create internal/shared/infrastructure/tenant/adapters/cached_configuration.go:

    • Implements ports.ConfigurationPort
    • sync.RWMutex + map[string]*cachedEntry (entry = config + expiresAt)
    • GetTenantConfig(): check cache → if valid, return clone → if expired, fetch delegate → on error return stale → on success update cache
    • cloneTenantConfig() to prevent mutation of cached entries
    • Invalidate(tenantID) and InvalidateAll() for event-driven cache busting
  3. Wire in bootstrap — Modify internal/bootstrap/dynamic_infrastructure_provider.go:

    • After creating RemoteConfigurationAdapter, wrap in CachedConfigurationAdapter
    • Only when MultiTenantConfigCacheTTLSec > 0 (0 disables caching)
    • Pass wrapped configPort to NewTenantConnectionManager
  4. Add config — Add to TenancyConfig in internal/bootstrap/config.go:

    MultiTenantConfigCacheTTLSec int `env:"MULTI_TENANT_CONFIG_CACHE_TTL_SEC" envDefault:"300"`
  5. Verify: go test -tags unit ./internal/shared/infrastructure/tenant/adapters/... + make lint
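The GetTenantConfig() flow from step 2 can be sketched as follows. The types here are assumptions made for illustration: TenantConfig and FetcherFunc are hypothetical stand-ins for the real config type and for the delegate behind ports.ConfigurationPort, whose actual signature is not shown in this issue. The clock is injected so that expiry can be exercised without sleeping.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// TenantConfig is a hypothetical stand-in for the real tenant config type.
type TenantConfig struct{ DSN string }

// FetcherFunc is a hypothetical stand-in for the wrapped delegate.
type FetcherFunc func(ctx context.Context, tenantID string) (TenantConfig, error)

type cachedEntry struct {
	config    TenantConfig
	expiresAt time.Time
}

// CachedFetcher caches delegate results for ttl and serves the stale
// entry when the delegate fails after expiry.
type CachedFetcher struct {
	delegate FetcherFunc
	ttl      time.Duration
	now      func() time.Time
	mu       sync.RWMutex
	entries  map[string]cachedEntry
}

func NewCachedFetcher(delegate FetcherFunc, ttl time.Duration, now func() time.Time) (*CachedFetcher, error) {
	if delegate == nil || ttl <= 0 || now == nil {
		return nil, errors.New("delegate, positive TTL, and clock are required")
	}
	return &CachedFetcher{delegate: delegate, ttl: ttl, now: now, entries: map[string]cachedEntry{}}, nil
}

func (c *CachedFetcher) GetTenantConfig(ctx context.Context, tenantID string) (TenantConfig, error) {
	c.mu.RLock()
	entry, found := c.entries[tenantID]
	c.mu.RUnlock()

	if found && c.now().Before(entry.expiresAt) {
		// Returned by value; a config holding maps or slices would need a deep clone.
		return entry.config, nil
	}

	fresh, err := c.delegate(ctx, tenantID)
	if err != nil {
		if found {
			return entry.config, nil // expired but usable: stale-on-error
		}
		return TenantConfig{}, err
	}

	c.mu.Lock()
	c.entries[tenantID] = cachedEntry{config: fresh, expiresAt: c.now().Add(c.ttl)}
	c.mu.Unlock()

	return fresh, nil
}

// Invalidate drops one tenant's entry so the next call hits the delegate.
func (c *CachedFetcher) Invalidate(tenantID string) {
	c.mu.Lock()
	delete(c.entries, tenantID)
	c.mu.Unlock()
}

// demo walks through a miss, a hit, a stale fallback, and an invalidation.
func demo() (delegateCalls int, staleDSN string, errAfterInvalidate error) {
	clock := time.Unix(0, 0)
	failing := false
	delegate := func(_ context.Context, id string) (TenantConfig, error) {
		delegateCalls++
		if failing {
			return TenantConfig{}, errors.New("tenant manager unavailable")
		}
		return TenantConfig{DSN: "postgres://example/" + id}, nil
	}

	cache, err := NewCachedFetcher(delegate, 300*time.Second, func() time.Time { return clock })
	if err != nil {
		panic(err)
	}

	ctx := context.Background()
	cache.GetTenantConfig(ctx, "t1") // miss: first delegate call
	cache.GetTenantConfig(ctx, "t1") // hit: delegate not called

	clock = clock.Add(301 * time.Second)
	failing = true
	stale, _ := cache.GetTenantConfig(ctx, "t1") // expired and delegate fails: stale entry

	cache.Invalidate("t1")
	_, errAfterInvalidate = cache.GetTenantConfig(ctx, "t1") // nothing left to fall back on

	return delegateCalls, stale.DSN, errAfterInvalidate
}

func main() {
	calls, dsn, err := demo()
	fmt.Println(calls, dsn, err)
}
```

This sketch does not deduplicate concurrent refreshes for the same tenant; two callers that both see an expired entry will both call the delegate. Whether that matters depends on how expensive the HTTP call is.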


Gap 6 — RabbitMQ Per-Tenant Routing [LOW]

Ring standard: Per-tenant queues/vhosts or tenant-scoped routing.
Current state: Shared matcher.events exchange. Tenant ID in message payload only.
What to implement: Tenant ID appended to routing keys. Topic exchange wildcards maintain backward compat.

Agent execution steps (8 tasks, ~25 min)

Skill: /ring:dev-cycle with ring:backend-engineer-golang

  1. Write failing tests — Create internal/shared/adapters/rabbitmq/routing_test.go:

    • TenantScopedRoutingKey("ingestion.completed", "uuid") → "ingestion.completed.uuid"
    • TenantScopedRoutingKey("ingestion.completed", "") → "ingestion.completed" (backward compat)
    • Whitespace tenant ID → unchanged key
  2. Implement helper — Create internal/shared/adapters/rabbitmq/routing.go:

    func TenantScopedRoutingKey(baseKey, tenantID string) string {
        tenantID = strings.TrimSpace(tenantID)
        if tenantID == "" { return baseKey }
        return baseKey + "." + tenantID
    }
  3. Update ingestion publisher — Modify internal/ingestion/adapters/rabbitmq/event_publisher.go:

    • Before publish, wrap routing key: scopedKey := sharedRabbitmq.TenantScopedRoutingKey(routingKey, auth.LookupTenantID(ctx))
  4. Update matching publisher — Same pattern in internal/matching/adapters/rabbitmq/event_publisher.go

  5. Verify: go test -tags unit ./internal/shared/adapters/rabbitmq/... + existing publisher tests + make lint

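Migration note: Existing consumers bound to ingestion.completed must update their bindings to keep receiving events once keys are tenant-scoped. On a topic exchange, ingestion.completed.* matches exactly one trailing word, so it receives every tenant-scoped key but not the unscoped ingestion.completed key. Consumers that must receive both should bind ingestion.completed.# (zero or more trailing words).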
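The effect of the new routing keys on consumer bindings can be checked with a small matcher. The sketch below reuses TenantScopedRoutingKey from step 2 and adds topicMatches, a hypothetical helper that imitates AMQP topic-exchange matching ("*" for exactly one dot-delimited word, "#" for zero or more). It is an illustration for reasoning about bindings, not code from the broker or from Matcher.

```go
package main

import (
	"fmt"
	"strings"
)

// TenantScopedRoutingKey appends the tenant ID to the base key when present.
func TenantScopedRoutingKey(baseKey, tenantID string) string {
	tenantID = strings.TrimSpace(tenantID)
	if tenantID == "" {
		return baseKey
	}
	return baseKey + "." + tenantID
}

// topicMatches reports whether a topic binding pattern matches a routing key.
func topicMatches(pattern, key string) bool {
	return matchWords(strings.Split(pattern, "."), strings.Split(key, "."))
}

func matchWords(pattern, key []string) bool {
	if len(pattern) == 0 {
		return len(key) == 0
	}

	switch pattern[0] {
	case "#":
		// "#" either consumes nothing, or consumes one word and stays active.
		if matchWords(pattern[1:], key) {
			return true
		}
		return len(key) > 0 && matchWords(pattern, key[1:])
	case "*":
		return len(key) > 0 && matchWords(pattern[1:], key[1:])
	default:
		return len(key) > 0 && pattern[0] == key[0] && matchWords(pattern[1:], key[1:])
	}
}

func main() {
	scoped := TenantScopedRoutingKey("ingestion.completed", "3f2b8c1e-9d4a-4e6b-8f1a-2c3d4e5f6a7b")
	unscoped := TenantScopedRoutingKey("ingestion.completed", "")

	for _, binding := range []string{"ingestion.completed", "ingestion.completed.*", "ingestion.completed.#"} {
		fmt.Printf("%-24s scoped=%v unscoped=%v\n",
			binding, topicMatches(binding, scoped), topicMatches(binding, unscoped))
	}
}
```

A UUID contains hyphens but no dots, so it counts as a single word and the "*" binding is enough for tenant-scoped keys.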


Gap 7 — lib-commons TenantMiddleware Pattern [LOW]

Ring standard: Use lib-commons v4 TenantMiddleware with WithPG/WithMB variadic options.
Current state: Custom TenantExtractor — functionally equivalent but custom code.
What to implement: Compliance documentation (lib-commons v4.3.1 does not export TenantMiddleware).

Agent execution steps (6 tasks, ~20 min)

Skill: /ring:dev-cycle with ring:backend-engineer-golang

  1. Verify lib-commons API — Check if TenantMiddleware is exported in lib-commons v4.3.1 (it is not)

  2. Create compliance document at docs/compliance/multi-tenant-middleware.md:

    • Status: COMPLIANT (Functional Equivalence)
    • Requirement-by-requirement mapping table showing all Ring middleware requirements pass
    • Migration path for when lib-commons exports the middleware
  3. Add compliance annotation — Comment block in internal/auth/middleware.go referencing the compliance doc

  4. Verify: make lint


Gaps 4 & 5 — M2M Credentials + Observability Metrics [LOW - CONDITIONAL]

Ring standard: M2MCredentialProvider with AWS Secrets Manager + L1/L2 cache + 6 mandatory metrics.
Current state: Matcher declares no targetServices. Does not call other Lerian services with per-tenant auth.
What to implement: Documented exemption (Ring standard explicitly says: "Skip this gate for services without targetServices").

Agent execution steps (4 tasks, ~10 min)

Skill: /ring:dev-cycle with ring:backend-engineer-golang

  1. Create exemption document at docs/compliance/m2m-credentials.md:

    • Status: EXEMPT (No Target Services)
    • Table showing no service-to-service calls require M2M
    • Migration path for when M2M is needed
    • M2M metrics table marked N/A
  2. Verify: make lint


Execution Order (Dependency Graph)

Gap 3 (TTL cache) ← simplest, prerequisite for Gap 2
  ↓
Gap 1 (schema provisioning) ← highest priority, prerequisite for Gap 2
  ↓
Gap 2 (event-driven discovery) ← depends on Gap 1 + Gap 3
  ↓
Gap 6 (RabbitMQ routing) ← independent
  ↓
Gap 7 (middleware docs) ← independent, documentation only
  ↓
Gaps 4+5 (M2M docs) ← independent, documentation only

Recommended sequence: Gap 3 → Gap 1 → Gap 2 → Gap 6 → Gap 7 → Gaps 4+5


Agent Ring Execution Guide

Recommended Skills & Agents per Gap

Gap              Skill             Primary Agent                  Reviewer Agents
1 (Schema)       /ring:dev-cycle   ring:backend-engineer-golang   All 7 reviewers via /ring:codereview
2 (Events)       /ring:dev-cycle   ring:backend-engineer-golang   All 7 reviewers via /ring:codereview
3 (Cache)        /ring:dev-cycle   ring:backend-engineer-golang   All 7 reviewers via /ring:codereview
6 (RabbitMQ)     /ring:dev-cycle   ring:backend-engineer-golang   All 7 reviewers via /ring:codereview
7 (Middleware)   /ring:dev-cycle   ring:backend-engineer-golang   ring:docs-reviewer
4+5 (M2M)        /ring:dev-cycle   ring:backend-engineer-golang   ring:docs-reviewer

Gate Flow per Gap

Each gap follows the dev-cycle gates:

  1. Gate 0 — Implementation: ring:backend-engineer-golang writes code following TDD (RED → GREEN → REFACTOR)
  2. Gate 1 — DevOps: Verify make up works with new config env vars
  3. Gate 2 — SRE Validation: Verify OpenTelemetry spans are correct
  4. Gate 3 — Unit Tests: 85%+ coverage on new files
  5. Gate 4 — Fuzz Tests: Fuzz constructors and parsers (schema provisioner, event listener JSON)
  6. Gate 5 — Property Tests: Verify cache invariants (TTL expiry, stale-while-revalidate)
  7. Gate 6 — Integration Tests: Test with real Postgres/Redis via testcontainers
  8. Gate 7 — Chaos Tests: Toxiproxy for Redis Pub/Sub disconnection, Postgres schema creation timeout
  9. Gate 8 — Code Review: All 7 reviewers in parallel via /ring:codereview
  10. Gate 9 — Validation: All acceptance criteria met

Verification Commands

# After all gaps complete
make test-unit          # All unit tests pass
make test-int           # Integration tests pass (including new schema provisioner tests)
make lint               # 75+ linters pass
make sec                # No security issues
make check-tests        # Every .go file has a matching _test.go
make check-test-tags    # All test files have proper build tags

Estimated Effort

Gap                  Tasks   Time
Gap 3 (Cache)        8       ~25 min
Gap 1 (Schema)       12      ~45 min
Gap 2 (Events)       10      ~35 min
Gap 6 (RabbitMQ)     8       ~25 min
Gap 7 (Middleware)   6       ~20 min
Gaps 4+5 (M2M)       4       ~10 min
Total                48      ~160 min

Files Summary

New Files (11)

  • internal/shared/infrastructure/tenant/adapters/cached_configuration.go
  • internal/shared/infrastructure/tenant/adapters/cached_configuration_test.go
  • internal/shared/infrastructure/tenant/adapters/schema_provisioner.go
  • internal/shared/infrastructure/tenant/adapters/schema_provisioner_test.go
  • internal/shared/infrastructure/tenant/adapters/redis_event_listener.go
  • internal/shared/infrastructure/tenant/adapters/redis_event_listener_test.go
  • internal/shared/adapters/rabbitmq/routing.go
  • internal/shared/adapters/rabbitmq/routing_test.go
  • tests/integration/tenant/schema_provisioner_test.go
  • docs/compliance/multi-tenant-middleware.md
  • docs/compliance/m2m-credentials.md

Modified Files (6)

  • internal/shared/ports/infrastructure.go — New port interfaces
  • internal/bootstrap/config.go — New config fields
  • internal/bootstrap/dynamic_infrastructure_provider.go — Wire cache, provisioner, event listener
  • internal/ingestion/adapters/rabbitmq/event_publisher.go — Tenant-scoped routing keys
  • internal/matching/adapters/rabbitmq/event_publisher.go — Tenant-scoped routing keys
  • internal/auth/middleware.go — Ring compliance annotation

Rollback Strategy

  • Each gap lands as its own set of commits, so it can be reverted independently
  • Feature flags: MULTI_TENANT_CONFIG_CACHE_TTL_SEC=0 disables cache, MULTI_TENANT_EVENT_CHANNEL="" disables event listener
  • Full rollback: git revert the merge commit
