- Copy the installation command for your preferred shell from the releases page.
- Verify that cbl installed correctly with cbl help.
- Set the Circuit Breaker Labs API key environment variable:
  - macOS/Linux:
      export CBL_API_KEY="<your_api_key_here>"
  - Windows (PowerShell):
      $env:CBL_API_KEY="<your_api_key_here>"
- Set the OpenAI API key environment variable (required for this example):
  - macOS/Linux:
      export OPENAI_API_KEY="<your_api_key_here>"
  - Windows (PowerShell):
      $env:OPENAI_API_KEY="<your_api_key_here>"
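
Before running an evaluation, it can help to confirm that both keys are actually exported in the current shell. The following is a minimal sketch for a POSIX shell on macOS/Linux; the helper name require_env is hypothetical and is not part of cbl.

```shell
# Hypothetical helper (not part of cbl): print each required variable that is
# unset or empty, and return non-zero if at least one is missing.
# Note: it uses the plain shell variables missing, name, and value as scratch space.
require_env() {
  missing=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: only call cbl when both keys from the steps above are present.
# require_env CBL_API_KEY OPENAI_API_KEY && cbl help
```

Because the function only reads the environment, it is safe to run repeatedly, for example at the top of a wrapper script.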
Try a single-turn evaluation:

    cbl single-turn \
      --threshold 0.75 \
      --variations 2 \
      --maximum-iteration-layers 2 \
      --test-case-groups suicidal_ideation \
      openai --model gpt-4.1-nano

Try a multi-turn evaluation:

    cbl multi-turn \
      --threshold 0.95 \
      --max-turns 8 \
      --test-case-groups suicidal_ideation \
      openai --model gpt-4.1-nano

Pre-built executables and installation methods for Linux, Mac, and Windows are automatically generated and available with each release.
Click here to get an access key.
You can see the available options and flags for cbl with cbl help or for a subcommand with cbl <subcommand> help.
The syntax for cbl is:

    cbl --top-level-arg1 <evaluation_type> --evaluation-arg1 <provider> --provider-arg1

where <evaluation_type> and <provider> are subcommands.
The available evaluation types are single-turn and multi-turn. The available providers are ollama, openai, and custom.
The following runs a single-turn evaluation against a custom OpenAI fine-tune and saves the results to result.json. Here single-turn is the evaluation type and openai is the provider. If you haven't already, set the CBL_API_KEY and OPENAI_API_KEY environment variables.

    cbl \
      --output-file result.json \
      single-turn \
      --threshold 0.3 \
      --variations 3 \
      --maximum-iteration-layers 2 \
      --test-case-groups suicidal_ideation \
      openai \
      --temperature 1.2 \
      --model "$MY_FINETUNE_ID"

For APIs that aren't already supported or OpenAI-compatible, cbl supports scripting. The custom provider expects a Rhai script that defines the translation between cbl's request/response schema and that of the custom endpoint. Example scripts are available in examples/providers/.
The --threshold flag accepts a value from 0 to 1. Values closer to 1 require responses that align more closely with clinical-grade safety standards. Lower values allow more permissive responses.
--max-turns must be an even number because each turn pairs one user message with one model response. The upper limit depends on your system configuration. Contact us if you need a higher limit for your environment.
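
The two constraints above (a threshold between 0 and 1, and an even number of turns) can be checked before an evaluation is launched. This is a minimal sketch, assuming a POSIX shell with awk available. The helpers valid_threshold and valid_max_turns are hypothetical and not part of cbl, and treating zero turns as invalid is an assumption of this sketch rather than documented behavior.

```shell
# Hypothetical pre-flight checks (not part of cbl).

# Succeeds when $1 is a plain decimal number between 0 and 1 inclusive.
valid_threshold() {
  case "$1" in
    ''|*[!0-9.]*|*.*.*|.) return 1 ;;  # reject empty, non-numeric, or malformed input
  esac
  awk -v t="$1" 'BEGIN { exit !(t >= 0 && t <= 1) }'
}

# Succeeds when $1 is a positive even integer.
valid_max_turns() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;           # reject empty or non-integer input
    *[02468]) ;;                       # last digit is even, keep checking
    *) return 1 ;;                     # last digit is odd
  esac
  [ "$1" -gt 0 ]                       # assumption: zero turns is not useful
}

# Example with the values from the multi-turn command in this README.
if valid_threshold 0.95 && valid_max_turns 8; then
  echo "arguments look valid"
fi
```

Checking the last digit rather than using shell arithmetic avoids surprises with inputs such as 08, which some shells would parse as an invalid octal number.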
CBL offers specialized test packs for different domains and risk scenarios. Pass one or more pack names with --test-case-groups, separated by commas.
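
When pack names are held in a script as separate values, they need to be joined with commas before being passed to --test-case-groups. A minimal sketch for a POSIX shell follows. The helper join_groups is hypothetical, and second_pack is a placeholder rather than a real pack name; suicidal_ideation is the only pack name taken from this README.

```shell
# Hypothetical helper (not part of cbl): join all arguments with commas.
# The body runs in a subshell, so the IFS change does not leak to the caller.
join_groups() (
  IFS=,
  printf '%s\n' "$*"   # "$*" joins the arguments using the first character of IFS
)

# second_pack is a placeholder; substitute the packs available to your account.
packs=$(join_groups suicidal_ideation second_pack)
printf '%s\n' "--test-case-groups $packs"
```

The resulting string can then be placed after --test-case-groups in any of the cbl commands shown earlier.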
Questions? Feedback? Reach us at team@circuitbreakerlabs.ai.
