This project uses Polly.js for robust E2E testing. It records and replays HTTP interactions with upstream LLM providers, ensuring tests are fast, reliable, and run without API keys.
The E2E tests are split by API type:
- Chat API: `packages/backend/src/services/__tests__/e2e_vcr_chat.test.ts` (Cases: `cases/chat/`)
- Messages API: `packages/backend/src/services/__tests__/e2e_vcr_messages.test.ts` (Cases: `cases/messages/`)
- Dynamic Discovery: Each suite loads `.json` files from its respective directory in `cases/`.
- Cassette Recording (Polly.js):
  - Requests to upstream providers are intercepted.
  - In Record Mode, real API calls are made and saved to `__cassettes__/` as JSON files.
  - In Replay Mode, the network is completely mocked using these saved JSON files.
- Validation: The test verifies that the `Dispatcher` logic correctly transforms the upstream data into a valid Unified response.
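The Dynamic Discovery step above can be pictured with a minimal sketch. The `TestCase` shape and the `discoverCases` helper are hypothetical names used for illustration; the actual suites may load their cases differently.

```typescript
import { mkdtempSync, readdirSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical shape: one test case per JSON file, named after the file.
interface TestCase {
  name: string;
  body: unknown;
}

// Load every .json file in `dir` as a request body; other files are ignored.
function discoverCases(dir: string): TestCase[] {
  return readdirSync(dir)
    .filter((file) => file.endsWith(".json"))
    .sort()
    .map((file) => ({
      name: file.replace(/\.json$/, ""),
      body: JSON.parse(readFileSync(join(dir, file), "utf8")),
    }));
}

// Demo against a throwaway directory so the sketch does not depend on the repo layout.
const demoDir = mkdtempSync(join(tmpdir(), "cases-"));
writeFileSync(join(demoDir, "basic.json"), JSON.stringify({ messages: [] }));
writeFileSync(join(demoDir, "notes.txt"), "not a test case");

const discovered = discoverCases(demoDir);
console.log(discovered.map((testCase) => testCase.name));
```

A suite built this way could register one test per discovered case, so adding a file to the directory is enough to add a test.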
To ensure test isolation and prevent "mock pollution" in Bun's shared-worker environment, this project uses a global setup script.
The root bunfig.toml is configured to preload packages/backend/test/setup.ts before any tests run. This script establishes "Gold Standard" mocks for global dependencies like the Logger.
Bun's mock.module is a process-global operation. Once a module is mocked, it remains mocked for the duration of that worker thread, and mock.restore() does not reset it.
To prevent crashes in other tests (e.g., TypeError: logger.info is not a function), follow these rules:
- Use the Global Setup: Common modules like `src/utils/logger` should be mocked once in `setup.ts`.
- Robust Mocking: If you must mock a module in a specific test file, your mock MUST implement the entire public interface of that module (including all log levels like `silly`, `debug`, etc.).
- Prefer Spying: If you need to assert that a global dependency was called, use `spyOn` on the already-mocked global instance rather than re-mocking the module.
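As a rough illustration of the Robust Mocking rule, the sketch below builds a logger stub that covers every level in a list, so no method is left undefined. The level list and the `createLoggerStub` helper are assumptions made for this example; only `info`, `debug`, and `silly` are named in this document, and the real logger's interface should be checked before copying this.

```typescript
// Assumed level names; verify them against the real logger's public interface.
const LOG_LEVELS = ["error", "warn", "info", "debug", "silly"] as const;
type LogLevel = (typeof LOG_LEVELS)[number];
type LoggerStub = Record<LogLevel, (...args: unknown[]) => void>;

interface RecordedCall {
  level: LogLevel;
  args: unknown[];
}

// Hypothetical helper: creates a stub with one recording function per level.
function createLoggerStub(calls: RecordedCall[] = []): LoggerStub {
  const stub = {} as LoggerStub;
  for (const level of LOG_LEVELS) {
    stub[level] = (...args: unknown[]) => {
      calls.push({ level, args });
    };
  }
  return stub;
}

const recordedCalls: RecordedCall[] = [];
const stubLogger = createLoggerStub(recordedCalls);
stubLogger.info("request dispatched");
stubLogger.silly("verbose detail");
console.log(recordedCalls.map((call) => call.level));
```

Because the stub is generated from the level list, a caller that uses a less common level such as `silly` still finds a function there. An object like this could serve as the logger returned from a `mock.module` factory.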
```typescript
import { logger } from "src/utils/logger";
import { spyOn, expect, test } from "bun:test";

test("my test", () => {
  // Spy on the already-mocked global logger instead of re-mocking the module.
  const infoSpy = spyOn(logger, "info");
  // ... run code ...
  expect(infoSpy).toHaveBeenCalled();
});
```

In Replay Mode, the tests use existing cassettes. No API keys or network access are required.
```shell
cd packages/backend
bun test
```

Tip: You can also run this via the VS Code task Bun: Backend Tests.
To capture new network interactions for ALL tests:
```shell
# From project root
PLEXUS_TEST_API_KEY="your-openai-key" \
PLEXUS_TEST_ANTHROPIC_API_KEY="your-anthropic-key" \
bun run update-cassettes
```

Note: The suite automatically scrubs sensitive headers and model names before saving to disk.
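The header scrubbing mentioned in the note can be pictured roughly as follows. This is a sketch under stated assumptions, not the suite's actual implementation: the list of sensitive header names, the `scrubHeaders` helper, and the reuse of the `scrubbed_key` string as a placeholder are all chosen for illustration.

```typescript
type HeaderMap = Record<string, string>;

// Assumed list of sensitive headers; the real suite may scrub a different set.
const SENSITIVE_HEADERS = new Set(["authorization", "x-api-key"]);
// Assumed placeholder value, borrowed from the default key name in this document.
const PLACEHOLDER = "scrubbed_key";

// Return a copy of `headers` with sensitive values replaced by the placeholder.
// Header names are compared case-insensitively; the original casing is kept.
function scrubHeaders(headers: HeaderMap): HeaderMap {
  const clean: HeaderMap = {};
  for (const [headerName, value] of Object.entries(headers)) {
    clean[headerName] = SENSITIVE_HEADERS.has(headerName.toLowerCase())
      ? PLACEHOLDER
      : value;
  }
  return clean;
}

const scrubbed = scrubHeaders({
  Authorization: "Bearer example-secret",
  "Content-Type": "application/json",
});
console.log(scrubbed);
```

Scrubbing before the cassette is written means that a committed cassette does not carry the key used during recording.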
To test the full system manually (Frontend + Backend):
- Start the development stack: `bun dev`
- Open the Dashboard at `http://localhost:4000`.
- Send requests to the API proxy at `http://localhost:4000/v1/...` using `curl` or `testcommands/test_request.ts`.
The following environment variables are used during Record Mode:
| Variable | Description | Default |
|---|---|---|
| `PLEXUS_TEST_API_KEY` | Chat-compatible API Key. | `scrubbed_key` |
| `PLEXUS_TEST_ANTHROPIC_API_KEY` | Messages API Key. | `scrubbed_key` |
| `PLEXUS_TEST_BASE_URL` | Base URL for Chat provider. | `https://api.upstream.mock/openai/v1` |
| `PLEXUS_TEST_ANTHROPIC_BASE_URL` | Base URL for Messages provider. | `https://api.anthropic.com/v1` |
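One way to read the table is as a set of fallbacks applied when a variable is unset. The sketch below shows that idea; the `RecordConfig` shape and the `loadRecordConfig` helper are hypothetical names, and the test suite itself may resolve these values differently.

```typescript
// Hypothetical config shape for illustration only.
interface RecordConfig {
  chatApiKey: string;
  messagesApiKey: string;
  chatBaseUrl: string;
  messagesBaseUrl: string;
}

type Env = Record<string, string | undefined>;

// Resolve each variable, falling back to the documented default when it is unset.
function loadRecordConfig(env: Env): RecordConfig {
  return {
    chatApiKey: env["PLEXUS_TEST_API_KEY"] ?? "scrubbed_key",
    messagesApiKey: env["PLEXUS_TEST_ANTHROPIC_API_KEY"] ?? "scrubbed_key",
    chatBaseUrl: env["PLEXUS_TEST_BASE_URL"] ?? "https://api.upstream.mock/openai/v1",
    messagesBaseUrl: env["PLEXUS_TEST_ANTHROPIC_BASE_URL"] ?? "https://api.anthropic.com/v1",
  };
}

const defaultConfig = loadRecordConfig({});
const overriddenConfig = loadRecordConfig({ PLEXUS_TEST_API_KEY: "example-key" });
console.log(defaultConfig.chatBaseUrl, overriddenConfig.chatApiKey);
```

In a real run the function would be called with `process.env`. Note that `??` only falls back when a variable is undefined, so a variable set to an empty string would be kept as-is in this sketch.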
- Add a new JSON request body to `cases/chat/` (for Chat-like) or `cases/messages/` (for Messages-like).
- Run the Record Mode command above to capture the network interaction.
- Commit the new case and its corresponding cassette in `__cassettes__/`.