DEV task: Add --region-fit flag and revamp VM size tiers with better SKU selection. TWO FEATURES IN ONE PR: ## Feature 1: --region-fit flag When the user passes --region-fit (or when a creation fail #986

@rysweet

Description

Task Description

DEV task: Add --region-fit flag and revamp VM size tiers with better SKU selection.

TWO FEATURES IN ONE PR:

Feature 1: --region-fit flag

When the user passes --region-fit (or when a creation fails with QuotaExceeded/SkuNotAvailable), azlin should automatically find a region where the requested VM size is available and has enough quota.

Implementation:

  1. Add --region-fit flag to azlin new CLI (in rust/crates/azlin-cli/src/lib.rs)
  2. Add a new function find_available_region() in rust/crates/azlin-azure/src/vm.rs or a new module:
    • Takes: required_cores (u32), preferred_regions (Vec<&str>), resource_group (&str)
    • For each candidate region (in parallel with short timeouts):
      • Check quota via az vm list-usage --location {region}
      • Check SKU availability via az vm list-skus --location {region} --size {sku}
    • Return first region with enough quota AND available SKU
    • Candidate regions: westus, westus2, westus3, centralus, eastus, eastus2, northcentralus, southcentralus (configurable)
  3. In handle_vm_new() in cmd_vm_ops.rs:
    • If --region-fit is passed, call find_available_region() before create_vm()
    • Print which region was selected and why
    • If no region found, show a table of all regions with their quota/availability
  4. Also integrate with quota error recovery: when create_vm fails with QuotaExceeded, suggest running with --region-fit
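
The region search in step 2 could be sketched as follows. This is a minimal illustration, not the planned implementation. The `RegionCheck` type and the `probe` closure are hypothetical stand-ins for the real `az vm list-usage` / `az vm list-skus` calls, which lets the selection logic run without Azure. The sketch omits the `resource_group` parameter and the per-call timeouts, and uses scoped OS threads rather than tokio for the parallel probes.

```rust
use std::thread;

/// Outcome of probing one region (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq)]
struct RegionCheck {
    region: String,
    free_cores: u32,
    sku_available: bool,
}

impl RegionCheck {
    /// A region fits when the SKU is offered AND enough quota is left.
    fn fits(&self, required_cores: u32) -> bool {
        self.sku_available && self.free_cores >= required_cores
    }
}

/// Probes all candidate regions in parallel, then returns the first fit in
/// preference order plus every result (for the "no region found" table).
fn find_available_region<F>(
    required_cores: u32,
    preferred_regions: &[&str],
    probe: F,
) -> (Option<String>, Vec<RegionCheck>)
where
    F: Fn(&str) -> RegionCheck + Sync,
{
    let results: Vec<RegionCheck> = thread::scope(|s| {
        // Spawn one scoped thread per region; handles keep preference order.
        let handles: Vec<_> = preferred_regions
            .iter()
            .map(|&region| {
                let probe = &probe;
                s.spawn(move || probe(region))
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("region probe panicked"))
            .collect()
    });

    let chosen = results
        .iter()
        .find(|r| r.fits(required_cores))
        .map(|r| r.region.clone());
    (chosen, results)
}
```

Returning the full result list alongside the chosen region keeps step 3's fallback table cheap: when the first element is `None`, the caller already has every region's quota and availability to print.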

Feature 2: Revamp VM size tiers

Current tiers are too coarse and only use D-series v3:
s = Standard_D2s_v3 (2 cores)
m = Standard_D16s_v3 (16 cores)
l = Standard_D32s_v3 (32 cores)
xl = Standard_D64s_v3 (64 cores)

Problems:

  • No 4-core or 8-core options (big jump from 2 to 16)
  • Only D-series v3 (older generation, often capacity-constrained)
  • No E-series option (memory-optimized, already used in fleet as E32as_v5)

New tier system (update in cmd_vm_ops.rs where VmSizeTier is mapped):
xs = Standard_D2s_v5 (2 cores, 8GB) - cheapest dev box
s = Standard_D4s_v5 (4 cores, 16GB) - light dev
m = Standard_D8s_v5 (8 cores, 32GB) - standard dev
l = Standard_D16s_v5 (16 cores, 64GB) - heavy dev
xl = Standard_D32s_v5 (32 cores, 128GB) - power user
2xl = Standard_D64s_v5 (64 cores, 256GB) - maximum
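
As a rough sketch, the tier table above might translate into a match like the one below. The function names `d_series_sku` and `tier_cores` are hypothetical, and the `Xxl` variant name is taken from the ambiguity resolution later in this issue. The actual enum and mapping in `azlin-cli/src/lib.rs` and `cmd_vm_ops.rs` may be shaped differently.

```rust
/// Size tiers from the proposed table; `Xxl` stands for the "2xl" tier.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VmSizeTier {
    Xs,
    S,
    M,
    L,
    Xl,
    Xxl,
}

/// Maps a tier to its D-series v5 SKU (hypothetical helper name).
fn d_series_sku(tier: VmSizeTier) -> &'static str {
    match tier {
        VmSizeTier::Xs => "Standard_D2s_v5",   // 2 cores, 8GB
        VmSizeTier::S => "Standard_D4s_v5",    // 4 cores, 16GB
        VmSizeTier::M => "Standard_D8s_v5",    // 8 cores, 32GB
        VmSizeTier::L => "Standard_D16s_v5",   // 16 cores, 64GB
        VmSizeTier::Xl => "Standard_D32s_v5",  // 32 cores, 128GB
        VmSizeTier::Xxl => "Standard_D64s_v5", // 64 cores, 256GB
    }
}

/// Core count per tier, e.g. as the `required_cores` input for a quota check.
fn tier_cores(tier: VmSizeTier) -> u32 {
    match tier {
        VmSizeTier::Xs => 2,
        VmSizeTier::S => 4,
        VmSizeTier::M => 8,
        VmSizeTier::L => 16,
        VmSizeTier::Xl => 32,
        VmSizeTier::Xxl => 64,
    }
}
```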

Key changes:

  • Move from v3 to v5 (newer, better availability, same or lower cost)
  • Add xs (2-core) and 2xl (64-core) tiers; 2xl covers the current 64-core use case
  • Make the existing xl = 32 cores instead of 64 (more realistic default)
  • Update VmSizeTier enum in azlin-cli/src/lib.rs
  • Update the match in cmd_vm_ops.rs
  • Update help text and docs

Also add a --vm-family flag or config option to prefer E-series:
--vm-family d (default, general purpose)
--vm-family e (memory-optimized, maps to E*as_v5 series)

When --vm-family e:
s = Standard_E2as_v5
m = Standard_E8as_v5
l = Standard_E16as_v5
xl = Standard_E32as_v5
2xl = Standard_E64as_v5
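
One possible way to express the family switch is to derive the SKU name from the tier's core count. Two assumptions are baked into this sketch and should be confirmed. First, the list above has no xs entry, so xs is taken from the ambiguity resolution later in this issue (Standard_E2as_v5). Second, the list gives s = Standard_E2as_v5, but the sketch follows the core-count parity stated in that resolution, so s becomes Standard_E4as_v5. The names `VmFamily` and `sku_for` are hypothetical.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VmFamily {
    D, // general purpose (default)
    E, // memory-optimized
}

impl FromStr for VmFamily {
    type Err = String;

    /// Parses the value passed to `--vm-family`.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "d" => Ok(VmFamily::D),
            "e" => Ok(VmFamily::E),
            other => Err(format!("unknown VM family '{other}' (expected d or e)")),
        }
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VmSizeTier {
    Xs,
    S,
    M,
    L,
    Xl,
    Xxl,
}

/// Builds the SKU name from family and tier.
/// ASSUMPTION: both families share the same core count per tier.
fn sku_for(family: VmFamily, tier: VmSizeTier) -> String {
    let cores = match tier {
        VmSizeTier::Xs => 2,
        VmSizeTier::S => 4,
        VmSizeTier::M => 8,
        VmSizeTier::L => 16,
        VmSizeTier::Xl => 32,
        VmSizeTier::Xxl => 64,
    };
    match family {
        VmFamily::D => format!("Standard_D{cores}s_v5"),
        VmFamily::E => format!("Standard_E{cores}as_v5"),
    }
}
```

If the literal list is preferred over parity, an explicit per-family match (as in the D-series table) would be the safer shape, since it does not assume the two families line up.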

Tests

  • Unit tests for find_available_region() with mocked az output
  • Tests for new tier mappings
  • Tests for --vm-family flag
  • All existing tests must pass
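
To make the "mocked az output" tests concrete, the quota parsing can be isolated in a pure function that takes the CLI output as a string. The sketch below assumes the quota command is run with `-o tsv` and a `--query` that selects `currentValue` and `limit` for the regional cores entry, so each line is `current<TAB>limit`. That output shape and the helper name `free_cores_from_tsv` are assumptions, not something this issue specifies.

```rust
/// Parses one TSV line of the form "current\tlimit" and returns the number
/// of cores still available, or None if the line is malformed.
/// ASSUMPTION: az was invoked so that it emits exactly these two columns.
fn free_cores_from_tsv(line: &str) -> Option<u32> {
    let mut fields = line.trim().split('\t');
    let current: u32 = fields.next()?.trim().parse().ok()?;
    let limit: u32 = fields.next()?.trim().parse().ok()?;
    // Usage can exceed the limit after a quota reduction; clamp to zero.
    Some(limit.saturating_sub(current))
}
```

A unit test can then feed fixed strings such as `"4\t10"` or a truncated line to the parser and assert on the result, without spawning `az` at all.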

Key files

  • rust/crates/azlin-cli/src/lib.rs (CLI flags, VmSizeTier enum)
  • rust/crates/azlin/src/cmd_vm_ops.rs (tier mapping, handle_vm_new)
  • rust/crates/azlin-azure/src/vm.rs (find_available_region)
  • rust/crates/azlin-azure/src/error_handler.rs (suggest --region-fit on quota errors)
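
For the error_handler.rs change, the quota-error suggestion might look roughly like this. The function name `region_fit_hint` and the hint wording are hypothetical. The two error codes are the ones named in the Feature 1 description. Matching on substrings of the error text is an assumption about how the existing handler classifies errors.

```rust
/// Returns a hint to append to the error message when a VM creation failure
/// looks like a quota or SKU-availability problem (hypothetical helper).
fn region_fit_hint(error_text: &str) -> Option<&'static str> {
    const RECOVERABLE_CODES: [&str; 2] = ["QuotaExceeded", "SkuNotAvailable"];
    if RECOVERABLE_CODES.iter().any(|code| error_text.contains(code)) {
        Some("Hint: re-run with --region-fit to search other regions for capacity.")
    } else {
        None
    }
}
```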

Requirements

● Read lib.rs
│ rust/crates/azlin-cli/src/lib.rs
└ L60:100 (41 lines read)

● Read lib.rs
│ rust/crates/azlin-cli/src/lib.rs
└ L140:250 (111 lines read)

● Read cmd_vm_ops.rs
│ rust/crates/azlin/src/cmd_vm_ops.rs
└ L1:50 (50 lines read)

● Read cmd_vm_ops.rs
│ rust/crates/azlin/src/cmd_vm_ops.rs
└ L225:320 (96 lines read)

● Read cmd_vm.rs
│ rust/crates/azlin/src/cmd_vm.rs
└ L1:70 (70 lines read)

● Read error_handler.rs
│ rust/crates/azlin-azure/src/error_handler.rs
└ L130:180 (50 lines read)

● Read vm.rs
│ rust/crates/azlin-azure/src/vm.rs
└ L1:40 (40 lines read)

● Read vm.rs
│ rust/crates/azlin-azure/src/vm.rs
└ L600:660 (61 lines read)

Now let me check the test file and tokio dependency details:

● Search (grep)
│ "tier_sku|test_all_cli_tiers|VmSizeTier"
│ (rust/crates/azlin/src/tests/test_group_70.rs)
└ 4 lines found

● Read test_group_70.rs
│ rust/crates/azlin/src/tests/test_group_70.rs
└ L1:60 (60 lines read)

● Read Cargo.toml
│ rust/crates/azlin-azure/Cargo.toml
└ 23 lines read

● Read test_group_70.rs
│ rust/crates/azlin/src/tests/test_group_70.rs
└ L246:275 (30 lines read)

I've completed my analysis. Here are the resolved ambiguities and final requirements:

Ambiguity Resolutions

  1. Xxl display name: Clap's ValueEnum derive generates the value xxl from the variant name Xxl, so the help text will show xs, s, m, l, xl, xxl, which is close enough to "2xl". No custom #[value(name = "2xl")] override is needed; users can type --size xxl.

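  2. Tokio for parallel region scanning: azlin-azure has tokio only as a dev-dependency. I'll implement synchronous per-region check helpers in azlin-azure/src/vm.rs and orchestrate the parallel calls from cmd_vm_ops.rs, where the tokio runtime is available. Because the helpers block on the az CLI, they should run via tokio::task::spawn_blocking rather than a plain tokio::spawn, which would stall the async worker threads.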

  3. E-series xs tier: Will map to Standard_E2as_v5 (2 cores, 16GB) to maintain 1:1 tier-to-core-count parity with D-series.

  4. Test impact: the tier-mapping tests in test_group_70.rs (around line 249) hardcode v3 SKU strings and will be updated to v5. The SSH timeout tests (lines 14-56) also reference v3 SKUs, but they exercise the timeout logic through extract_core_count(), which already handles both v3 and v5 names, so they remain valid unchanged.

  5. --vm-family only applies to tier-based selection: When --vm-size is explicitly provided, --vm-family is ignored (direct SKU override wins).

  6. Backward-compat breaking change acknowledged: --size s goes from 2-core D2s_v3 to 4-core D4s_v5. This is intentional per spec.
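
The precedence rule in resolution 5 can be illustrated with a small sketch. The function name `resolve_sku` and its parameters are hypothetical, and the tier is reduced to a core count to keep the example short. The point is only that an explicit `--vm-size` short-circuits the family/tier lookup.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VmFamily {
    D,
    E,
}

/// Picks the SKU to create. An explicit `--vm-size` value always wins;
/// the family only matters when the SKU is derived from a tier.
fn resolve_sku(vm_size: Option<&str>, family: VmFamily, tier_cores: u32) -> String {
    match vm_size {
        // Direct SKU override: --vm-family is ignored.
        Some(sku) => sku.to_string(),
        // Tier-based selection: the family picks the series.
        None => match family {
            VmFamily::D => format!("Standard_D{tier_cores}s_v5"),
            VmFamily::E => format!("Standard_E{tier_cores}as_v5"),
        },
    }
}
```

For example, `resolve_sku(Some("Standard_NC6s_v3"), VmFamily::E, 8)` returns the override unchanged, while `resolve_sku(None, VmFamily::E, 8)` yields `Standard_E8as_v5`.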

Final Requirements — Ready for Design

All requirements are unambiguous. The explicit requirements, acceptance criteria, assumptions, and out-of-scope items from the clarified spec are complete and consistent with the codebase. Proceeding to design phase.

Acceptance Criteria

  • All explicit requirements met
  • Tests passing
  • Philosophy compliant
  • Documentation updated

Classification

Generated via default-workflow recipe

Metadata

Labels

workflow:default (Issues created by default workflow)
