Add testnet support #185

Closed
neekolas wants to merge 1 commit into otr-dev from 04-15-otr-testnet

Conversation

neekolas (Collaborator) commented Apr 15, 2026

Note

Add testnet environment support to deployment pipeline and invite URL routing

  • Adds an otr-testnet branch, workflow_dispatch option, and deploy-otr-testnet repository dispatch type to deploy-aws.yml, deploying to the convos-otr-testnet Terraform workspace.
  • Adds a verify-testnet CI job in s3-lifecycle-verify.yml that checks S3 lifecycle canaries against TESTNET_API_URL.
  • Updates buildInviteUrl in join.ts to map XMTP_ENV=testnet to https://testnet.convos.org instead of the dev domain.
  • Centralizes environment checks into isXmtpProduction() and shouldUseDevBehavior() helpers in config.ts, replacing scattered process.env.XMTP_ENV comparisons.
  • Documents a new testnet promotion stage in RELEASE.md: production deploys must now fast-forward through otr-testnet before reaching otr-prod.
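A minimal sketch of what the centralized helpers described above could look like. The positional `xmtpEnv` parameter defaulting to `XMTP_ENV` follows the signature quoted later in the review; the members of `XmtpEnv`, the hard-coded `XMTP_ENV` value, and the rule that every non-production environment gets dev behavior are assumptions, not the PR's actual code.

```typescript
// Assumed union; the real XmtpEnv type in src/config.ts may have other members.
type XmtpEnv = "dev" | "testnet" | "production";

// Stand-in for the value the real module derives from process.env.XMTP_ENV.
const XMTP_ENV: XmtpEnv = "testnet";

// True only for the production network.
function isXmtpProduction(xmtpEnv: XmtpEnv = XMTP_ENV): boolean {
  return xmtpEnv === "production";
}

// Assumption: every non-production environment (dev and testnet) gets dev-style behavior.
function shouldUseDevBehavior(xmtpEnv: XmtpEnv = XMTP_ENV): boolean {
  return !isXmtpProduction(xmtpEnv);
}
```

Call sites that previously compared `process.env.XMTP_ENV` directly would then call `shouldUseDevBehavior()` or `isXmtpProduction()` instead, so adding an environment means touching one module.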

Macroscope summarized 8132287. (Automatic summaries will resume when PR exits draft mode or review begins).

Summary by CodeRabbit

Release Notes

  • New Features

    • Testnet environment is now available for deployments with automated S3 lifecycle verification and health checks.
  • Chores

    • Updated deployment workflows to support testnet branch and environment selection.
    • Extended release process with explicit testnet stage before production deployment.

This stack of pull requests is managed by Graphite. Learn more about stacking.

coderabbitai bot (Contributor) commented Apr 15, 2026

Walkthrough

This PR extends the deployment infrastructure to support a testnet environment as an intermediate stage between development and production. Changes include new GitHub Actions jobs for testnet deployment and lifecycle verification, updated release procedures, refactored environment checks into centralized helper functions, and corresponding test coverage.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| GitHub Actions Workflows: .github/workflows/deploy-aws.yml, .github/workflows/s3-lifecycle-verify.yml | Added testnet deployment and lifecycle test jobs in deploy-aws.yml; added testnet verification job in s3-lifecycle-verify.yml. Updated workflow triggers to include otr-testnet branch and repository_dispatch type. |
| Release Documentation: RELEASE.md | Added Testnet release stage with git merge instructions between otr-dev and otr-testnet. Updated Production stage to merge from otr-testnet instead of otr-dev. Added rule requiring production commits to pass through testnet. |
| Environment Configuration: src/config.ts | Added two exported helper functions: isXmtpProduction() and shouldUseDevBehavior() for centralized environment checks. |
| API Route Configuration: src/api/v2/index.ts | Replaced direct process.env.XMTP_ENV check with shouldUseDevBehavior() call for dev routes registration. |
| Handler Functions: src/api/v2/agents/handlers/join.ts, src/api/v2/notifications/handlers/webhook.ts | Exported buildInviteUrl() with optional environment parameter; added shouldAllowForcedErrors() helper; replaced hardcoded environment checks with centralized helper functions. |
| Test Coverage: tests/agent-join.test.ts, tests/config-environment.test.ts | Added test suites for new environment helper functions and invite URL generation with testnet domain mapping. |

Sequence Diagram(s)

sequenceDiagram
    participant GHA as GitHub Actions
    participant Terraform as Terraform
    participant API as API Server
    participant S3 as S3 Lifecycle
    
    GHA->>Terraform: deploy_otr_testnet (with testnet workspace)
    Terraform->>API: Deploy API with testnet config
    GHA->>API: test_lifecycle_testnet: GET /healthcheck
    API-->>GHA: 200 OK (ready)
    GHA->>API: POST /api/v2/assets/test/lifecycle
    API->>S3: Renew lifecycle records
    S3-->>API: Renewal complete
    API-->>GHA: 200 + success: true
    GHA->>API: verify-testnet: GET /api/v2/assets/test/lifecycle-status
    API-->>GHA: 200 + status: healthy
    GHA->>GHA: Write summary with lifecycle metrics
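The request sequence in the diagram can be sketched as data plus a verdict function. The three paths and the expected `200`, `success: true`, and `status: healthy` values come from the diagram; treating `success` and `status` as JSON body fields, and the names `Exchange`, `lifecycleChecks`, and `firstFailure`, are assumptions made for illustration.

```typescript
// One HTTP exchange as the workflow would observe it.
interface Exchange {
  status: number;
  body: Record<string, unknown>;
}

// The three calls from the diagram, in order, with the condition each must meet.
const lifecycleChecks = [
  {
    name: "healthcheck",
    method: "GET",
    path: "/healthcheck",
    passes: (r: Exchange) => r.status === 200,
  },
  {
    name: "renew",
    method: "POST",
    path: "/api/v2/assets/test/lifecycle",
    passes: (r: Exchange) => r.status === 200 && r.body.success === true,
  },
  {
    name: "verify",
    method: "GET",
    path: "/api/v2/assets/test/lifecycle-status",
    passes: (r: Exchange) => r.status === 200 && r.body.status === "healthy",
  },
];

// Returns the name of the first check that fails or has no response, or null when all pass.
function firstFailure(responses: Exchange[]): string | null {
  const failed = lifecycleChecks.find((check, i) => {
    const response = responses[i];
    return response === undefined || !check.passes(response);
  });
  return failed ? failed.name : null;
}
```

Keeping the checks as data makes it straightforward to run the same sequence against the dev, testnet, and production API URLs.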

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • PR #160: Extends S3 lifecycle CI and verification workflows; both add testnet variants to the same GitHub Actions workflows.
  • PR #159: Implements the asset lifecycle test and status endpoints (/api/v2/assets/test/lifecycle, /api/v2/assets/test/lifecycle-status) that the new workflow jobs invoke.
  • PR #166: Modifies src/api/v2/index.ts and asset lifecycle routing; both add environment-aware route configuration.

Poem

🐰 Through dev to testnet's burrow deep,
Then prod with guards to watch and keep!
Environment helpers, clear and true,
Deployment flows now pass straight through! 🌟
One hop, two hops—the pipeline hops with glee,
A testnet waystation for quality to be! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Add testnet support' directly aligns with the main objective of the pull request, which comprehensively introduces testnet support across workflows, configuration, and deployment infrastructure. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


claude bot commented Apr 15, 2026

Claude finished @neekolas's task —— View job


PR Review: Add testnet support

I've completed a comprehensive review of this PR. Overall, the implementation is solid and follows good practices, but there are several important issues that should be addressed.


✅ Strengths

  1. Clear architectural intent: The testnet environment as an intermediate stage between dev and prod is well-conceived
  2. Good test coverage: New helper functions have corresponding unit tests
  3. Documentation: RELEASE.md clearly documents the new deployment workflow
  4. Centralized logic: The environment check consolidation into isXmtpProduction() and shouldUseDevBehavior() reduces scattered conditionals

🔴 Critical Issues

1. Production deployment not gated on testnet ancestry (.github/workflows/deploy-aws.yml:124-147)

Issue: The deploy_otr_prod job still only depends on push_to_registry (line 124), meaning production can deploy commits that never went through testnet. This leaves the new release rule in RELEASE.md unenforced by CI.

Impact: High - violates the stated deployment policy

Recommendation: Add a guard job that verifies the commit exists on otr-testnet branch and make deploy_otr_prod depend on it:

verify_prod_commit_on_testnet:
  name: Verify prod commit exists on otr-testnet
  runs-on: ubuntu-latest
  if: (inputs.ref == 'otr-prod') || (inputs.environment == 'otr-prod') || (github.ref == 'refs/heads/otr-prod' && !inputs.ref && !inputs.environment) || (github.event_name == 'repository_dispatch' && github.event.client_payload.ref == 'otr-prod')
  steps:
    - name: Checkout target ref
      uses: actions/checkout@v4
      with:
        fetch-depth: 0
        ref: ${{ github.event_name == 'repository_dispatch' && github.event.client_payload.ref || github.event_name == 'workflow_dispatch' && inputs.environment || github.event_name == 'workflow_call' && inputs.ref || github.ref }}
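    - name: Verify ancestry from otr-testnet
      run: |
        # Fetch the full otr-testnet history. A --depth=1 fetch would truncate it,
        # so any commit behind the branch tip would fail the ancestry check.
        git fetch origin "+refs/heads/otr-testnet:refs/remotes/origin/otr-testnet"
        candidate_sha="$(git rev-parse HEAD)"
        # Exit status 0 means the candidate is reachable from otr-testnet.
        if ! git merge-base --is-ancestor "$candidate_sha" "origin/otr-testnet"; then
          echo "::error::Commit $candidate_sha is not present on otr-testnet"
          exit 1
        fi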

deploy_otr_prod:
  name: Deploy to OTR Production
  runs-on: ubuntu-latest
  needs: [push_to_registry, verify_prod_commit_on_testnet]  # Add the guard

2. Prettier formatting violation (src/api/v2/agents/handlers/join.ts:86-88)

Issue: The forceError ternary assignment violates Prettier formatting rules.

Current code:

const forceError = shouldAllowForcedErrors()
  ? req.headers["x-force-error"]
  : undefined;

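Should be (the single-line form requested in CodeRabbit's inline comment on the same statement; whether Prettier accepts it depends on the project's configured print width):

const forceError = shouldAllowForcedErrors() ? req.headers["x-force-error"] : undefined;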

Impact: Medium - will fail CI linting checks


🟡 Important Issues

3. Type safety issue with domain mapping (src/api/v2/agents/handlers/join.ts:40-47)

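Issue: The domainByEnv record is typed as Record<Extract<XmtpEnv, "production" | "testnet">, string>, so the "dev" domain has to be handled by a separate ternary before the domainByEnv[xmtpEnv] lookup. The lookup only type-checks because that ternary narrows xmtpEnv first; if the guard changes or XmtpEnv gains another member, the indexing becomes a type error.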

Current code:

const domainByEnv: Record<
  Extract<XmtpEnv, "production" | "testnet">,
  string
> = {
  production: "popup.convos.org",
  testnet: "testnet.convos.org",
};
const domain = xmtpEnv === "dev" ? "dev.convos.org" : domainByEnv[xmtpEnv];

Recommendation: Simplify to a safer pattern:

const domainByEnv: Record<XmtpEnv, string> = {
  production: "popup.convos.org",
  testnet: "testnet.convos.org",
  dev: "dev.convos.org",
};
const domain = domainByEnv[xmtpEnv];

This is more maintainable and type-safe.

4. Missing test coverage for workflow conditions

Issue: While the new helper functions have unit tests, there are no tests verifying the actual workflow behavior (e.g., that testnet uses shouldUseDevBehavior correctly for the force-error feature).

Recommendation: Add integration or E2E tests that verify:

  • Force errors work in testnet environment
  • Force errors are blocked in production environment
  • Invite URLs are generated with correct domains per environment
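The three checks above could start from a table-driven test along these lines. This is only a sketch: the domains come from the review, but the `XmtpEnv` members, the URL path shape, the simplified string-only header type, and the helper `readForceError` are assumptions, and a real integration test would exercise the running handlers rather than local copies.

```typescript
// Assumed union; the real XmtpEnv type may differ.
type XmtpEnv = "dev" | "testnet" | "production";

const domainByEnv: Record<XmtpEnv, string> = {
  production: "popup.convos.org",
  testnet: "testnet.convos.org",
  dev: "dev.convos.org",
};

// Hypothetical URL shape; only the domains are taken from the review.
function buildInviteUrl(slug: string, xmtpEnv: XmtpEnv): string {
  return `https://${domainByEnv[xmtpEnv]}/${encodeURIComponent(slug)}`;
}

// Assumption: forced errors are allowed everywhere except production.
function shouldAllowForcedErrors(xmtpEnv: XmtpEnv): boolean {
  return xmtpEnv !== "production";
}

// Hypothetical helper mirroring the handler snippet: the header is only honored when allowed.
function readForceError(
  headers: Record<string, string | undefined>,
  xmtpEnv: XmtpEnv,
): string | undefined {
  return shouldAllowForcedErrors(xmtpEnv) ? headers["x-force-error"] : undefined;
}

// One row per environment: expected invite domain and whether forced errors pass through.
const cases: Array<{ env: XmtpEnv; domain: string; forced: boolean }> = [
  { env: "dev", domain: "dev.convos.org", forced: true },
  { env: "testnet", domain: "testnet.convos.org", forced: true },
  { env: "production", domain: "popup.convos.org", forced: false },
];

// Returns a list of failure messages; an empty list means every row passed.
function runCases(): string[] {
  const failures: string[] = [];
  for (const c of cases) {
    const url = buildInviteUrl("abc", c.env);
    if (url.indexOf(`https://${c.domain}/`) !== 0) {
      failures.push(`${c.env}: unexpected URL ${url}`);
    }
    const forced = readForceError({ "x-force-error": "500" }, c.env);
    if ((forced !== undefined) !== c.forced) {
      failures.push(`${c.env}: forced-error gating is wrong`);
    }
  }
  return failures;
}
```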

💡 Suggestions (Nice to Have)

5. Workflow duplication could be reduced

Issue: The three lifecycle test jobs (test_lifecycle_dev, test_lifecycle_testnet, test_lifecycle_prod) and the three verification jobs in s3-lifecycle-verify.yml are nearly identical.

Recommendation: Consider using GitHub Actions matrix strategy to reduce duplication:

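test_lifecycle:
  name: Test S3 Lifecycle Renewal (${{ matrix.env }})
  runs-on: ubuntu-latest
  # needs cannot read the matrix context, so wait on every deploy job
  # and gate each matrix entry at step level instead.
  needs: [deploy_otr_dev, deploy_otr_testnet, deploy_otr_prod]
  # Run even when some deploy jobs were skipped.
  if: ${{ !cancelled() }}
  strategy:
    matrix:
      include:
        - env: dev
          api_url_var: DEV_API_URL
          deploy_job: deploy_otr_dev
        - env: testnet
          api_url_var: TESTNET_API_URL
          deploy_job: deploy_otr_testnet
        - env: prod
          api_url_var: PROD_API_URL
          deploy_job: deploy_otr_prod
  steps:
    # ... use ${{ vars[matrix.api_url_var] }}, with each step guarded by
    # if: needs[matrix.deploy_job].result == 'success'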

This would make future changes easier to maintain.

6. Consider making helper functions use object parameters

Note: CodeRabbit suggested this, and while it's not strictly necessary, it aligns with modern TypeScript best practices.

Current:

export function isXmtpProduction(xmtpEnv: XmtpEnv = XMTP_ENV): boolean

Suggested:

export function isXmtpProduction({ xmtpEnv = XMTP_ENV }: { xmtpEnv?: XmtpEnv } = {})

This pattern makes it easier to add optional parameters in the future and improves call-site clarity. However, for simple functions with one parameter, the current approach is acceptable.
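To make the trade-off concrete, here is a small side-by-side sketch of the two signatures at their call sites. The function names with `Positional` and `Object` suffixes, the `XmtpEnv` members, and the hard-coded `XMTP_ENV` value are hypothetical and exist only for this comparison.

```typescript
// Assumed union; the real XmtpEnv type may differ.
type XmtpEnv = "dev" | "testnet" | "production";

// Stand-in for the configured environment.
const XMTP_ENV: XmtpEnv = "dev";

// Current style: a single positional parameter with a default.
function isXmtpProductionPositional(xmtpEnv: XmtpEnv = XMTP_ENV): boolean {
  return xmtpEnv === "production";
}

// Suggested style: an object parameter, so later options can be added by name.
function isXmtpProductionObject(
  { xmtpEnv = XMTP_ENV }: { xmtpEnv?: XmtpEnv } = {},
): boolean {
  return xmtpEnv === "production";
}

// Call sites: the object form names the argument, the positional form is shorter.
const viaPositional = isXmtpProductionPositional("production");
const viaObject = isXmtpProductionObject({ xmtpEnv: "production" });
const viaDefault = isXmtpProductionObject();
```

Both forms behave the same here; the difference only shows up when a second optional argument is introduced.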


🔒 Security

No security concerns identified. The testnet environment correctly inherits dev-like behavior for testing features (like forced errors) while maintaining production's restrictions.


⚡ Performance

No performance concerns. The environment checks are simple boolean comparisons with cached environment variables.


📊 Summary

| Category | Status | Notes |
| --- | --- | --- |
| Code Quality | 🟡 Good | Minor formatting issue to fix |
| Bugs | 🔴 Critical | Production deployment not properly gated |
| Security | ✅ Excellent | No issues |
| Performance | ✅ Excellent | No issues |
| Test Coverage | 🟡 Good | Unit tests present, integration tests would be beneficial |

Recommendation: Address the critical issues (#1 and #2) before merging. The other suggestions can be addressed in follow-up PRs if desired.



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (3)
RELEASE.md (1)

1-1: Rename this file to kebab-case for repository consistency.

Line 1 is inside RELEASE.md, but the filename itself is uppercase and does not follow kebab-case (for example, release.md).

As per coding guidelines, "Follow consistent file naming (kebab-case)".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@RELEASE.md` at line 1, The file name RELEASE.md is not in kebab-case; rename
RELEASE.md to release.md to comply with repository naming conventions, use git
mv to preserve history, and update any references to "RELEASE.md" in docs,
READMEs, CI configs, or build scripts (search for the string RELEASE.md) so they
point to "release.md"; commit the rename with a clear message like "rename
RELEASE.md to release.md" to complete the change.
.github/workflows/deploy-aws.yml (1)

201-236: Lifecycle test jobs are a good candidate for matrix refactor.

test_lifecycle_dev, test_lifecycle_testnet, and test_lifecycle_prod are near-identical; converting to a matrix (env + URL + needs) would simplify future changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/deploy-aws.yml around lines 201 - 236, Replace the three
near-identical jobs test_lifecycle_dev, test_lifecycle_testnet, and
test_lifecycle_prod with a single matrix job (e.g., job name test_lifecycle)
that iterates over environments; define matrix.include entries for each
environment containing the environment label (dev/testnet/prod), the
corresponding API URL variable (e.g., TESTNET_API_URL), the deployment job it
follows (deploy_otr_dev/testnet/prod) and the matching secret token name
(LIFECYCLE_DEV_TOKEN / LIFECYCLE_TEST_TOKEN / LIFECYCLE_PROD_TOKEN); update the
job to reference matrix.env and matrix.url (and matrix.token via secrets).
Because jobs.<job_id>.needs cannot read the matrix context, list all three
deployment jobs in needs and gate each entry's steps on the result of its own
deployment job, keeping the existing healthcheck and curl steps unchanged but
parameterized by the matrix values (replace explicit references to
test_lifecycle_testnet, ${{ vars.TESTNET_API_URL }}, and ${{
secrets.LIFECYCLE_TEST_TOKEN }} with matrix.url and secrets[matrix.token]).
.github/workflows/s3-lifecycle-verify.yml (1)

81-151: Consider deduplicating the env-specific verification jobs.

This block is almost identical to dev/prod. A matrix job (env + API URL + title) would reduce maintenance drift risk.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/s3-lifecycle-verify.yml around lines 81 - 151, The
workflow duplicates nearly identical jobs (verify-testnet, verify-dev,
verify-prod) — replace them with a single matrix job that iterates over
environments and injects the env-specific values (e.g.,
TESTNET_API_URL/secrets.LIFECYCLE_TEST_TOKEN) so maintenance is centralized;
create a matrix entry (env, api_url, token_secret, title) and update the step
that posts to "${{ vars.TESTNET_API_URL }}/api/..." to use the matrix
api_url/token values, keep the step logic that parses response and writes to
GITHUB_STEP_SUMMARY unchanged, and remove the separate verify-testnet job so the
new matrix job (e.g., job name verify-lifecycle) covers all environments.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/deploy-aws.yml:
- Around line 143-147: The production deploy job (deploy_otr_prod) is still only
depending on push_to_registry, allowing prod to run without testnet ancestry;
update the workflow so deploy_otr_prod depends on the testnet gate job
(deploy_otr_testnet or a new guard job) instead of or in addition to
push_to_registry and ensure the guard job's condition verifies the
ref/environment ancestry (use the existing deploy_otr_testnet job name or create
a guard job and add it to deploy_otr_prod's needs field) so production deploys
cannot run unless the testnet gate passes.

In `@src/api/v2/agents/handlers/join.ts`:
- Around line 79-80: The ternary assignment to forceError violates Prettier
formatting; reformat the declaration so the ternary is on one line and properly
spaced: assign forceError using the result of shouldAllowForcedErrors() ?
req.headers["x-force-error"] : undefined, referencing the existing symbols
forceError, shouldAllowForcedErrors, and req.headers["x-force-error"] to locate
and fix the statement so it passes the project's Prettier checks.
- Around line 35-46: Refactor buildInviteUrl and shouldAllowForcedErrors to use
object parameters and rely on TypeScript's inferred return types: change
buildInviteUrl to buildInviteUrl({ slug, xmtpEnv = XMTP_ENV }) and change
shouldAllowForcedErrors to shouldAllowForcedErrors({ xmtpEnv = XMTP_ENV } = {}),
but when invoking shouldUseDevBehavior inside shouldAllowForcedErrors call it
positionally as shouldUseDevBehavior(xmtpEnv) (not shouldUseDevBehavior({
xmtpEnv })). Update all call sites and tests to call buildInviteUrl({ slug })
and shouldAllowForcedErrors() (or pass an object when providing xmtpEnv).

In `@src/config.ts`:
- Around line 40-46: Refactor isXmtpProduction and shouldUseDevBehavior to
accept a single object parameter (e.g., { xmtpEnv = XMTP_ENV }) and remove
explicit boolean return annotations so TypeScript infers the return type; update
all call sites (including tests and src/api/v2/agents/handlers/join.ts) to pass
an object with the xmtpEnv property instead of a positional argument, and ensure
shouldUseDevBehavior calls the refactored isXmtpProduction with the same object
parameter shape.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 83371547-fde9-404e-8e5c-0d9520c5a3b0

📥 Commits

Reviewing files that changed from the base of the PR and between 1362f70 and 8132287.

⛔ Files ignored due to path filters (1)
  • bun.lock is excluded by !**/*.lock
📒 Files selected for processing (9)
  • .github/workflows/deploy-aws.yml
  • .github/workflows/s3-lifecycle-verify.yml
  • RELEASE.md
  • src/api/v2/agents/handlers/join.ts
  • src/api/v2/index.ts
  • src/api/v2/notifications/handlers/webhook.ts
  • src/config.ts
  • tests/agent-join.test.ts
  • tests/config-environment.test.ts

@neekolas neekolas force-pushed the 04-15-otr-testnet branch from 8132287 to f109dd7 Compare April 15, 2026 21:51
@neekolas
Collaborator Author

Superseded by the split Graphite stack: #186, #187, #188, #189.

@neekolas neekolas closed this Apr 15, 2026