@jawhnycooke commented Nov 29, 2025

Summary

This PR adds comprehensive per-user token quota monitoring with fine-grained controls, and updates the EPCC workflow slash commands with improved multi-session support.


Part 1: Quota Monitoring

Enables administrators to manage costs and prevent unexpected overages.

Key Features

  • Per-user/group quota policies: Set different limits for users, groups, or defaults with clear precedence rules
  • Multiple limit types: Monthly tokens and daily tokens (with burst buffer); the cost-based (USD) limits introduced early in this PR are removed by a later commit, so quotas are token-based only
  • Enforcement modes: alert (notify only) or block (deny credential access)
  • Real-time quota API: JWT-authenticated endpoint for credential-time quota checks
  • Browser notifications: Visual warning/blocked pages when users approach or exceed limits
  • Periodic re-check: Configurable interval (default 30 min) to close the 12-hour credential cache gap
  • Bill shock protection: Auto-calculated daily limits prevent runaway usage
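The two enforcement modes listed above can be illustrated with a small decision function. This is a minimal sketch rather than the PR's implementation: `QuotaDecision` and `decide` are hypothetical names, and treating usage at exactly 100% of the limit as exceeded is an assumption based on the "100%+" wording used for the blocked page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuotaDecision:
    allowed: bool   # whether credentials should still be issued
    notify: bool    # whether an alert should be raised
    reason: str


def decide(used_tokens: int, limit_tokens: int, mode: str) -> QuotaDecision:
    """Apply an enforcement mode ("alert" or "block") to a usage figure."""
    if mode not in ("alert", "block"):
        raise ValueError(f"unknown enforcement mode: {mode!r}")
    if used_tokens < limit_tokens:
        return QuotaDecision(True, False, "within limit")
    if mode == "alert":
        # alert: notify only, access continues
        return QuotaDecision(True, True, "limit exceeded (alert only)")
    # block: deny credential access
    return QuotaDecision(False, True, "limit exceeded (blocked)")
```

In this sketch the only difference between the modes is the `allowed` flag once the limit is reached; both still produce a notification.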

CLI Commands Added

# Set quotas
ccwb quota set-user <email> --monthly-limit 500M --daily-limit 20M
ccwb quota set-group engineering --monthly-limit 400M --enforcement block
ccwb quota set-default --monthly-limit 225M

# Manage quotas
ccwb quota list
ccwb quota show <email>
ccwb quota usage <email>
ccwb quota delete user <email>
ccwb quota unblock <email> --duration 24h

Infrastructure Changes

  • New quota-monitoring.yaml CloudFormation stack with:
    • DynamoDB tables for policies and usage metrics
    • Lambda functions for quota checking and monitoring
    • API Gateway with JWT authorizer for quota API
    • SNS topic for alerts

Configuration

During ccwb init:

  • Monthly token limit (default: 225M)
  • Daily limit with burst buffer (5-25%, default 10%)
  • Enforcement modes for daily/monthly limits
  • Quota re-check interval (0-60 min, default 30)
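The ranges and defaults listed above could be validated with something like the following sketch. The class name `QuotaSettings` and the exact field layout are assumptions for illustration; only the defaults (225M, 10%, 30 min, daily=alert, monthly=block) and the allowed ranges come from this PR's description and commit messages.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QuotaSettings:
    """Hypothetical container for the values collected during `ccwb init`."""

    monthly_token_limit: int = 225_000_000   # default: 225M
    burst_buffer_percent: int = 10           # allowed range: 5-25
    daily_enforcement_mode: str = "alert"
    monthly_enforcement_mode: str = "block"
    quota_check_interval: int = 30           # minutes, allowed range: 0-60

    def validate(self) -> List[str]:
        """Return a list of human-readable problems (empty when valid)."""
        problems = []
        if self.monthly_token_limit <= 0:
            problems.append("monthly_token_limit must be positive")
        if not 5 <= self.burst_buffer_percent <= 25:
            problems.append("burst_buffer_percent must be between 5 and 25")
        if not 0 <= self.quota_check_interval <= 60:
            problems.append("quota_check_interval must be between 0 and 60")
        for name in ("daily_enforcement_mode", "monthly_enforcement_mode"):
            if getattr(self, name) not in ("alert", "block"):
                problems.append(f"{name} must be 'alert' or 'block'")
        return problems
```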

Part 2: EPCC Workflow Slash Commands Update

Updates EPCC workflow slash commands with improved multi-session support and feature tracking capabilities.

Changes

| File | Description |
| --- | --- |
| epcc-code.md | Added session startup protocol for long-running projects |
| epcc-commit.md | Added feature verification gate for tracked projects |
| epcc-explore.md | Improved autonomous exploration patterns |
| epcc-plan.md | Added feature list finalization section |
| prd.md | Added feature list generation capability |
| trd.md | Added technical feature enrichment section |
| epcc-resume.md | New - Command for multi-session work resumption |

Files Changed

Quota Monitoring

| Area | Files |
| --- | --- |
| CLI | cli/commands/quota.py (new), init.py, deploy.py, test.py, destroy.py |
| Config | config.py, quota_policies.py (new) |
| Credential Provider | credential_provider/__main__.py |
| Infrastructure | quota-monitoring.yaml, quota_check/index.py (new), quota_monitor/index.py, metrics_aggregator/index.py |
| Docs | QUOTA_MONITORING.md, CLI_REFERENCE.md, okta-setup.md |

EPCC Workflow

| Area | Files |
| --- | --- |
| Commands | assets/claude-code-plugins/plugins/epcc-workflow/commands/*.md (7 files) |

Test plan

Quota Monitoring

  • Run ccwb init and configure quota monitoring
  • Deploy with ccwb deploy quota
  • Test quota CLI commands (set-user, set-group, list, show, usage, delete)
  • Verify quota API authentication with JWT
  • Test enforcement modes (alert vs block)
  • Verify browser notifications for warning/blocked states
  • Test periodic re-check with cached credentials
  • Run ccwb test --quota-only for automated validation

EPCC Workflow

  • Test /epcc-explore with --quick and --deep flags
  • Test /epcc-plan feature list generation
  • Test /epcc-code session startup protocol
  • Test /epcc-commit feature verification gate
  • Test /epcc-resume for multi-session continuity

Jawhny Cooke and others added 20 commits November 26, 2025 16:15
…ase 2)

Implements comprehensive quota management system with real-time enforcement:

## New Features
- Fine-grained quota policies: user > group > default precedence
- Enforcement modes: "alert" (default) vs "block" for access control
- Real-time quota check API (Lambda + API Gateway HTTP API)
- Admin override via `ccwb quota unblock` command
- Enhanced SNS alerts with actionable CLI commands

## CLI Commands Added
- `ccwb quota set-user <email>` - Set user-specific quota policy
- `ccwb quota set-group <group>` - Set group quota policy
- `ccwb quota set-default` - Set default policy for all users
- `ccwb quota list` - List all configured policies
- `ccwb quota show <type> <id>` - Show policy details
- `ccwb quota delete <type> <id>` - Delete a policy
- `ccwb quota usage <email>` - Check user's current usage
- `ccwb quota unblock <email>` - Temporarily unblock a user

## Quota Limits Supported
- Monthly token limits (e.g., 300M, 1B)
- Daily token limits (e.g., 15M)
- Monthly cost limits (USD)
- Configurable warning thresholds (80%, 90%)

## Infrastructure
- QuotaCheckFunction Lambda for real-time checks
- API Gateway HTTP API endpoint
- DynamoDB tables: QuotaPolicies, UserQuotaMetrics
- Fail-open error handling (configurable)

## Key Design Decisions
- Blocking only at credential issuance (not mid-session)
- Most restrictive group policy wins for multi-group users
- TTL-based auto-expiry for unblock overrides
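The precedence rule (user > group > default) and the "most restrictive group wins" decision can be sketched as below. `Policy` and `resolve_policy` are hypothetical names, and interpreting "most restrictive" as "lowest monthly token limit" is an assumption; the real comparison may also consider daily limits or enforcement mode.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Policy:
    policy_type: str          # "user", "group" or "default"
    identifier: str           # email, group name, or "default"
    monthly_token_limit: int


def resolve_policy(email: str, groups: Iterable[str],
                   policies: List[Policy]) -> Optional[Policy]:
    """Pick the effective policy: user first, then groups, then default."""
    by_key = {(p.policy_type, p.identifier): p for p in policies}
    user_policy = by_key.get(("user", email))
    if user_policy is not None:
        return user_policy
    group_policies = [by_key[("group", g)] for g in groups
                      if ("group", g) in by_key]
    if group_policies:
        # Assumption: the lowest monthly limit is the most restrictive.
        return min(group_policies, key=lambda p: p.monthly_token_limit)
    return by_key.get(("default", "default"))
```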

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
…ota API

- Add OIDC parameters (OidcIssuerUrl, OidcClientId) to quota stack deployment
- Change default monthly token limit from 300M to 10M tokens
- Default quota monitoring to enabled when monitoring is enabled
- Add descriptive text explaining quota monitoring features in init prompt
- Add dependency note clarifying quota requires monitoring stack
- Update CLI Reference with guidance on ccwb deploy vs ccwb deploy quota
- Secure quota check API with JWT authentication via API Gateway authorizer
- Extract user identity from validated JWT claims (no query parameter spoofing)
- Update credential provider to send JWT token in Authorization header
- Fix destroy, status, and distribute commands to use active profile instead of hardcoded "default"
- Add AWS Account ID display to init configuration summary table
- Update CLI_REFERENCE.md documentation to reflect profile behavior changes

These commands now properly fall back to the active profile from ~/.ccwb/config.json
when --profile is not specified, preventing "Profile 'default' not found" errors.
Display quota monitoring enabled/disabled status in the configuration
summary table shown at the end of ccwb init, between Monitoring and
Analytics Pipeline rows.
Add new test step (Step 6) to verify quota monitoring API is accessible
and responding correctly when quota monitoring is enabled.

- Add _test_quota_api method that gets JWT via --get-monitoring-token
  and calls the /check endpoint with Bearer authentication
- Integrate test into flow after inference profiles test
- Add skip status display ("-") for tests that are intentionally skipped
- Update summary to show skipped count when applicable
- Handle various scenarios: quota disabled (skip), enabled but no
  endpoint (warning), API works (pass/warning based on allowed status),
  auth/connection failures (fail)
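The scenarios handled by the new test step can be summarized in a short sketch. The function names, the choice of `urllib`, and the mapping of HTTP status codes to outcomes are illustrative assumptions; only the `/check` path, Bearer authentication, and the skip/warning/pass/fail outcomes are taken from the commit message above.

```python
import urllib.request
from typing import Optional


def build_check_request(endpoint: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request to the quota API's /check path."""
    url = endpoint.rstrip("/") + "/check"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def classify_result(enabled: bool, endpoint: Optional[str],
                    http_status: Optional[int], allowed: Optional[bool]) -> str:
    """Map a quota API probe to one of: skip, warning, pass, fail."""
    if not enabled:
        return "skip"        # quota monitoring disabled
    if not endpoint:
        return "warning"     # enabled, but no endpoint recorded
    if http_status is None:
        return "fail"        # connection failure (no response at all)
    if http_status != 200:
        return "fail"        # auth errors (401/403) and other failures
    return "pass" if allowed else "warning"
```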
Display quota monitoring stack outputs in the deployment summary:
- Quota API Endpoint
- Alert Topic ARN
- User Metrics Table
- Policies Table

Only shown when quota_monitoring_enabled is True.
- Add Quota Management section to TOC
- Document all 8 quota subcommands (set-user, set-group, set-default,
  list, delete, show, usage, unblock)
- Update init section to mention quota configuration
- Update test section to mention quota API testing
- Link to QUOTA_MONITORING.md for detailed architecture
- Fix test command to find packages in nested dist/{profile}/{timestamp}/ structure
- Fix quota API test to pass --profile flag and run from package directory
- Add auto-detection of profile from package config.json
- Save quota API endpoint to profile after deploy
- Fix datetime.isoformat() bug in quota policy creation
- Add profile auto-detection fallback in credential provider
- Update CLI_REFERENCE.md test command documentation
…tion

Add auto-calculated daily token limits to prevent unexpected costs from
runaway usage. Daily limits are computed from monthly quota with a
configurable burst buffer (5-25%, default 10%).

Changes:
- Add daily_token_limit, burst_buffer_percent, daily/monthly enforcement
  mode fields to Profile config
- Enhance ccwb init with daily limit prompts, burst buffer guidance,
  and separate enforcement mode selection
- Add DailyTokenLimit, DailyEnforcementMode, MonthlyEnforcementMode
  parameters to quota-monitoring.yaml CloudFormation template
- Pass new parameters from deploy command to CloudFormation
- Update QUOTA_MONITORING.md with "Daily Limits and Bill Shock Protection"
  section explaining calculation and recommended settings
- Update CLI_REFERENCE.md with new init prompts

Defaults: daily=alert (warn only), monthly=block (deny access)
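As a worked example of the daily-limit calculation, the sketch below spreads the monthly quota evenly and adds the burst buffer. The commit message does not state the exact formula, so the 30-day divisor and the function name `daily_limit` are assumptions; the 5-25% range and 10% default are from the text above.

```python
def daily_limit(monthly_limit: int, burst_buffer_percent: int = 10,
                days_in_month: int = 30) -> int:
    """Daily token limit = (monthly / days) plus a burst buffer.

    Integer arithmetic is used so the result is exact for large token counts.
    """
    if not 5 <= burst_buffer_percent <= 25:
        raise ValueError("burst buffer must be between 5 and 25 percent")
    if monthly_limit <= 0 or days_in_month <= 0:
        raise ValueError("limits and day counts must be positive")
    return monthly_limit * (100 + burst_buffer_percent) // (100 * days_in_month)


# With the 225M default and a 10% buffer: 225M / 30 = 7.5M, plus 10% = 8.25M
print(daily_limit(225_000_000))  # 8250000
```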
Continue deleting stacks even when some fail, and provide detailed
manual cleanup instructions for resources that can't be auto-deleted.

Changes:
- Add get_failed_resources() to CloudFormationManager to query
  DELETE_FAILED resources from CloudFormation
- Update destroy command to collect failed resources across all stacks
  instead of stopping on first failure
- Add _show_cleanup_summary() with resource-specific AWS CLI commands
  for S3 buckets, CloudWatch log groups, DynamoDB tables, ECR repos
- Update _delete_stack() to return different codes for success (0),
  partial success with retained resources (1), and actual errors (2)
- Return success (0) when all stacks processed, even if some resources
  need manual cleanup
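The continue-on-failure behaviour can be sketched as a loop that records each stack's result instead of stopping early. The names here are hypothetical, and how a hard error (code 2) affects the overall exit status is not spelled out in the commit message; this sketch assumes it makes the command return non-zero, while retained resources (code 1) do not.

```python
from typing import Callable, Iterable, List, Tuple


def destroy_all(stack_names: Iterable[str],
                delete_stack: Callable[[str], int]) -> Tuple[int, List[str], List[str]]:
    """Delete every stack, collecting partial successes and errors.

    delete_stack is expected to return 0 (success), 1 (deleted but some
    resources retained), or 2 (actual error), as described above.
    """
    needs_cleanup: List[str] = []
    errored: List[str] = []
    for name in stack_names:
        code = delete_stack(name)      # never abort the loop on failure
        if code == 1:
            needs_cleanup.append(name)
        elif code == 2:
            errored.append(name)
    overall = 1 if errored else 0      # assumption: only hard errors fail the run
    return overall, needs_cleanup, errored
```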
Update default monthly token limit to better reflect realistic usage
patterns for Claude Code with Bedrock.

Changes:
- Update config.py default from 10M to 225M (with 80%/90% thresholds)
- Update init.py default prompt values
- Update deploy.py fallback defaults
- Update QUOTA_MONITORING.md examples and documentation
- Update CLI_REFERENCE.md examples
Add comprehensive guide for configuring Okta to support quota monitoring:
- Required JWT scopes (openid, email, profile, groups)
- Step-by-step instructions for adding groups scope
- Groups claim configuration for JWT tokens
- Token lifetime settings and recommendations
- Creating groups for quota policies
- ccwb commands for deploying and configuring quotas
- Verification steps using jwt.io
- Link to full QUOTA_MONITORING.md documentation
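As a local alternative to pasting a token into jwt.io, the claims can be inspected with a few lines of standard-library Python. This is a minimal sketch with hypothetical function names: it decodes the payload without verifying the signature, so it is only suitable for checking that the expected claims are present, never for authorization decisions. The claim names `email` and `groups` are assumed from the scopes listed above.

```python
import base64
import json
from typing import Dict, Iterable, List


def decode_jwt_claims(token: str) -> Dict:
    """Decode a JWT payload WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)   # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def missing_claims(claims: Dict, required: Iterable[str] = ("email", "groups")) -> List[str]:
    """Return the required claim names that are absent from the token."""
    return [name for name in required if name not in claims]
```

An empty list from `missing_claims` suggests the groups claim configuration took effect for that token.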
The cursor pointer and the highlighted option were out of sync because
the `checked` parameter was set on questionary.Choice objects, but
`checked` is only valid for checkbox(), not select(). For select(), the
default must be passed to the select() call itself.

Changes:
- Remove `checked` parameter from Choice objects for select menus
- Add `default` parameter to select() calls for model, profile, and
  source region selections
- Checkbox selections (subnet picker) correctly use `checked`
The "Resources to be created" list now shows quota monitoring
infrastructure when quota monitoring is enabled:
- DynamoDB tables for quota tracking
- Lambda function for quota checking
- API Gateway for real-time quota API
Add visual browser notifications for quota warnings and blocks:
- Show progress bars for monthly/daily usage when >= 80%
- Yellow warning page for approaching limits (80-99%)
- Red blocked page when access is denied (100%+)
- Terminal output preserved alongside browser notifications
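The thresholds above map naturally onto a small classifier. The sketch below uses hypothetical function names and a plain-text progress bar as a stand-in for the browser rendering; it also assumes block enforcement is active, since in alert mode reaching 100% would not deny access.

```python
def notification_level(used_tokens: int, limit_tokens: int) -> str:
    """Return "none", "warning" (80-99%) or "blocked" (100%+)."""
    if limit_tokens <= 0:
        raise ValueError("limit must be positive")
    percent = used_tokens * 100 // limit_tokens   # floor, so 99.9% stays a warning
    if percent >= 100:
        return "blocked"
    if percent >= 80:
        return "warning"
    return "none"


def progress_bar(used_tokens: int, limit_tokens: int, width: int = 20) -> str:
    """Render usage as a fixed-width text bar, capped at 100%."""
    if limit_tokens <= 0:
        raise ValueError("limit must be positive")
    filled = min(width, max(0, used_tokens) * width // limit_tokens)
    return "#" * filled + "-" * (width - filled)
```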

Add dedicated quota testing command:
- New --quota-only flag runs 6 quota-specific tests
- New --quota-api flag to override endpoint for testing
- Tests: config validation, API, policy CRUD operations

Fix deploy command to save quota table names:
- Save PoliciesTableName and QuotaTableName from stack outputs
- Enables quota tests to work without manual configuration

Update documentation:
- Add User Notifications section to QUOTA_MONITORING.md
- Update CLI_REFERENCE.md with new test flags
Close the 12-hour enforcement gap by re-checking quota even when
credentials are cached. Configurable interval (default 30 min) set
during ccwb init.

Changes:
- Add quota_check_interval field to Profile config (default: 30 min)
- Add init prompt for re-check interval with guidance
- Add _should_recheck_quota() to check if interval has elapsed
- Add _get_last_quota_check_time() and _save_quota_check_timestamp()
- Add _get_cached_token_claims() to retrieve email from monitoring token
- Modify run() to check quota when returning cached credentials
- Update QUOTA_MONITORING.md with "Periodic Quota Re-Check" section
- Update CLI_REFERENCE.md init documentation

Intervals: 0=every request, 30=default, 60=relaxed
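The interval check described above might look like the following sketch. It is a standalone stand-in for `_should_recheck_quota()`, whose real signature is not shown in this PR page; the parameters, and the choice to re-check when no previous timestamp exists, are assumptions.

```python
import time
from typing import Optional


def should_recheck_quota(last_check: Optional[float], interval_minutes: int,
                         now: Optional[float] = None) -> bool:
    """Decide whether quota must be re-checked before reusing cached credentials.

    last_check is a Unix timestamp of the previous check, or None if unknown.
    """
    if interval_minutes <= 0:
        return True                      # 0 = check on every request
    if last_check is None:
        return True                      # no record of a previous check
    current = time.time() if now is None else now
    return current - last_check >= interval_minutes * 60
```

With the default of 30 minutes, a user blocked shortly after obtaining credentials would be caught at the next credential request at least half an hour after the last check, rather than after the full 12-hour cache lifetime.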
Display the quota re-check interval in the Step 4 Configuration Summary
table alongside monthly/daily limits and enforcement modes.
Add ccwb quota export and ccwb quota import commands for bulk policy
management. Supports JSON and CSV formats with conflict handling.

New commands:
- quota export <file> - Export policies to JSON/CSV
- quota import <file> - Import policies with --skip-existing, --update,
  --dry-run, --auto-daily, and --burst options

Changes:
- Add _format_tokens() and _parse_tokens() utilities to quota_policies.py
- Add export_policies() method to QuotaPolicyManager
- Add bulk_import_policies() with validation and conflict handling
- Add QuotaExportCommand and QuotaImportCommand CLI classes
- Register new commands in cli/__init__.py
- Add command documentation to CLI_REFERENCE.md
- Add "Bulk Policy Management" section to QUOTA_MONITORING.md
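The token utilities mentioned above can be approximated as follows. These are illustrative stand-ins for `_parse_tokens()` and `_format_tokens()`, not the PR's code: the accepted suffixes (K, M, B), case-insensitivity, and the rule of only abbreviating exact multiples are assumptions.

```python
from decimal import Decimal, InvalidOperation

_MULTIPLIERS = {"K": 10**3, "M": 10**6, "B": 10**9}


def parse_tokens(text: str) -> int:
    """Convert strings such as "300M", "1B" or "15000000" to an integer."""
    s = text.strip().upper().replace(",", "")
    if not s:
        raise ValueError("empty token value")
    multiplier = 1
    if s[-1] in _MULTIPLIERS:
        multiplier = _MULTIPLIERS[s[-1]]
        s = s[:-1]
    try:
        # Decimal avoids float rounding for inputs such as "1.5M"
        return int(Decimal(s) * multiplier)
    except InvalidOperation:
        raise ValueError(f"invalid token value: {text!r}") from None


def format_tokens(value: int) -> str:
    """Render exact multiples compactly, e.g. 300000000 -> "300M"."""
    for suffix, multiplier in (("B", 10**9), ("M", 10**6), ("K", 10**3)):
        if value >= multiplier and value % multiplier == 0:
            return f"{value // multiplier}{suffix}"
    return str(value)
```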

Features:
- Auto-detect JSON/CSV format by file extension
- Human-readable token values (300M instead of 300000000)
- Auto-calculate daily limits with configurable burst buffer
- Dry-run mode to preview changes
- Type filtering for selective import/export
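The conflict handling and dry-run behaviour can be pictured as a planning step that classifies each incoming policy before anything is written. This is a sketch under assumptions: `plan_import` is a hypothetical name, policies are reduced to (type, identifier) keys, and reporting an error when neither --skip-existing nor --update is given is a guess at the default.

```python
from typing import Dict, Iterable, List, Set, Tuple

Key = Tuple[str, str]   # (policy type, identifier), e.g. ("user", "a@example.com")


def plan_import(existing: Set[Key], incoming: Iterable[Key],
                on_conflict: str = "error") -> Dict[str, List[Key]]:
    """Classify incoming policies without modifying anything.

    on_conflict: "skip" (like --skip-existing), "update" (like --update),
    or "error" (assumed behaviour when neither flag is given).
    """
    if on_conflict not in ("error", "skip", "update"):
        raise ValueError(f"unknown conflict mode: {on_conflict!r}")
    plan: Dict[str, List[Key]] = {"create": [], "update": [], "skip": [], "error": []}
    for key in incoming:
        if key not in existing:
            plan["create"].append(key)
        else:
            plan[on_conflict].append(key)   # conflict: follow the chosen mode
    return plan
```

A dry run would print this plan and stop; a real import would then apply the "create" and "update" entries.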
Remove all cost-based quota tracking and enforcement. Quota management
is now purely token-based (monthly and daily limits only).

Rationale:
- Cost calculations depend on accurate pricing tables that can become stale
- Different model pricing and cache token handling adds complexity
- Token-based limits are deterministic and easier to understand
- Reduces maintenance burden and potential for miscalculation

Changes:
- Remove monthly_cost_limit from QuotaPolicy model
- Remove BEDROCK_PRICING dictionary and calculate_cost() from models.py
- Remove --cost-limit option from CLI quota commands
- Remove cost enforcement from quota_check Lambda
- Remove cost alerts from quota_monitor Lambda (format_cost_alert)
- Remove cost calculation from metrics_aggregator Lambda
- Remove cost display from credential_provider browser notifications
- Update CLI_REFERENCE.md and QUOTA_MONITORING.md documentation

The estimated_cost field in DynamoDB can remain (no schema change) - it
will simply not be populated or used. Future cleanup can remove it.
- Updated epcc-code.md with session startup protocol for multi-session support
- Updated epcc-commit.md with feature verification gate
- Updated epcc-explore.md with improved autonomous exploration
- Updated epcc-plan.md with feature list finalization
- Updated prd.md with feature list generation
- Updated trd.md with technical feature enrichment
- Added new epcc-resume.md for multi-session work resumption