GPU vendor/model SQL prefilter with denormalized tokens #513

@linear

What

Tighten the prefilter to filter providers on requested GPU vendor/model in SQL, instead of leaving every GPU workload to the bin-packer. Two coordinated changes — write-side and read-side — ship together so the read side never observes a half-populated gpu_models column.

Write side. Update projectRow (apps/provider-inventory/src/lib/project-row/project-row.ts) to emit both the full vendor/name token and the vendor-only token into gpuModels. This makes vendor-only requests filterable with the same array overlap operator the read side already uses.
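A minimal sketch of the write-side token emission described above. The `PhysicalGpu` shape and the `toGpuModelTokens` helper are hypothetical names for illustration; the real types in `project-row.ts` may differ, and de-duplicating tokens is an assumption, not something this issue mandates.

```typescript
// Hypothetical shape of one physical GPU as projectRow sees it.
interface PhysicalGpu {
  vendor: string;
  name: string;
}

// Emits both the full vendor/name token and the vendor-only token for each
// GPU. A Set removes duplicates, so eight identical GPUs yield two tokens.
function toGpuModelTokens(gpus: PhysicalGpu[]): string[] {
  const tokens = new Set<string>();
  for (const gpu of gpus) {
    tokens.add(`${gpu.vendor}/${gpu.name}`);
    tokens.add(gpu.vendor);
  }
  return Array.from(tokens);
}

const exampleTokens = toGpuModelTokens([
  { vendor: "nvidia", name: "H100" },
  { vendor: "nvidia", name: "A100" },
]);
// exampleTokens: ["nvidia/H100", "nvidia", "nvidia/A100"]
```

With the vendor-only token stored next to the full tokens, a request for any nvidia GPU and a request for nvidia/H100 can both be answered by the same overlap test against gpu_models.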

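Case sensitivity. The Postgres && operator on text[] compares elements case-sensitively. The bin-packer's matchesGPU already normalizes case at match time, which masks case mismatches today, but the provider API is case-sensitive, so the tokens written to gpu_models and the tokens in the filter must agree in case for the SQL overlap to match.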
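To illustrate the case-sensitivity concern, here is a small sketch that mimics array overlap with exact string equality. Both `overlaps` and `normalizeGpuToken` are hypothetical helpers, and using lowercase as the canonical form is an assumption; this issue does not say which casing should win.

```typescript
// Assumption: lowercase is the canonical form on both the write and read side.
function normalizeGpuToken(token: string): string {
  return token.trim().toLowerCase();
}

// Mimics text[] && text[]: true when the two arrays share at least one
// element under exact (case-sensitive) string equality.
function overlaps(a: string[], b: string[]): boolean {
  const seen = new Set(a);
  return b.some((token) => seen.has(token));
}

const storedTokens = ["nvidia/H100", "nvidia"];
const requestedTokens = ["NVIDIA/h100"];

// Without normalization the overlap misses a GPU the bin-packer would accept.
const rawMatch = overlaps(storedTokens, requestedTokens);

// Normalizing both sides the same way restores the match.
const normalizedMatch = overlaps(
  storedTokens.map(normalizeGpuToken),
  requestedTokens.map(normalizeGpuToken),
);
// rawMatch is false, normalizedMatch is true
```

Whatever canonical form is chosen, it needs to be applied in both projectRow and the aggregator, otherwise the prefilter can drop providers that the bin-packer would have matched.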

Read side. Extend the aggregator (from Issue 1) with per-unit GPU constraints, and have the repository iterate them when composing the SQL:

interface BidScreeningFilter {
  // ... existing capacity fields
  units: {
    gpuTokens: string[];   // OR-alternatives for one resource unit
                           //   [nvidia/H100, nvidia/A100] (multi-spec OR within one unit)
                           //   [nvidia] (wildcard model → vendor-only)
                           //   []      (no GPU on this unit → omit clause)
  }[];
}

For every unit with non-empty gpuTokens, the repository emits gpu_models && ARRAY[$1,$2,...]::text[], all AND'd in the WHERE clause.
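The clause composition above could look roughly like the following. `buildGpuClauses`, `GpuUnitFilter`, and `SqlFragment` are hypothetical names, and the `firstParamIndex` parameter assumes the repository numbers placeholders after the existing capacity parameters; the actual query builder may work differently.

```typescript
interface GpuUnitFilter {
  gpuTokens: string[];
}

interface SqlFragment {
  clauses: string[]; // one overlap clause per unit with GPU tokens
  params: string[];  // bind values, in placeholder order
}

// Builds one `gpu_models && ARRAY[...]::text[]` clause per non-empty unit.
// firstParamIndex is the next free $n after any earlier bind parameters.
function buildGpuClauses(units: GpuUnitFilter[], firstParamIndex = 1): SqlFragment {
  const clauses: string[] = [];
  const params: string[] = [];
  let next = firstParamIndex;
  for (const unit of units) {
    if (unit.gpuTokens.length === 0) continue; // no GPU on this unit: omit clause
    const placeholders = unit.gpuTokens.map(() => `$${next++}`);
    clauses.push(`gpu_models && ARRAY[${placeholders.join(",")}]::text[]`);
    params.push(...unit.gpuTokens);
  }
  return { clauses, params };
}

const fragment = buildGpuClauses([
  { gpuTokens: ["nvidia/H100", "nvidia/A100"] },
  { gpuTokens: [] },
  { gpuTokens: ["amd"] },
]);
const gpuWhere = fragment.clauses.join(" AND ");
// gpuWhere: gpu_models && ARRAY[$1,$2]::text[] AND gpu_models && ARRAY[$3]::text[]
// fragment.params: ["nvidia/H100", "nvidia/A100", "amd"]
```

Tokens go through bind parameters instead of being interpolated into the SQL string, which keeps requested GPU names out of the query text.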

Why per-unit and not flat: the bin-packer enforces replica consistency per resource unit (Akash spec ErrGroupResourceMismatch), and across-unit constraints are AND'd. A flat union lets through providers that satisfy only one of two services' GPU needs.
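The difference between a flat union and per-unit clauses can be seen in a small in-memory simulation. The `overlaps` helper stands in for the SQL overlap operator, and the provider and unit data are made up for illustration.

```typescript
// Stand-in for text[] && text[] using exact string equality.
function overlaps(a: string[], b: string[]): boolean {
  const seen = new Set(a);
  return b.some((token) => seen.has(token));
}

// Illustrative provider that only has nvidia GPUs.
const providerGpuModels = ["nvidia/H100", "nvidia"];

// Two resource units with divergent GPU needs.
const requestedUnits = [
  { gpuTokens: ["nvidia/H100"] },
  { gpuTokens: ["amd/mi300x"] },
];

// Flat union: a single clause over every requested token.
const flatTokens = ([] as string[]).concat(...requestedUnits.map((u) => u.gpuTokens));
const flatPasses = overlaps(providerGpuModels, flatTokens);

// Per-unit: one clause per non-empty unit, all AND'd together.
const perUnitPasses = requestedUnits
  .filter((u) => u.gpuTokens.length > 0)
  .every((u) => overlaps(providerGpuModels, u.gpuTokens));
// flatPasses is true (false positive), perUnitPasses is false
```

The flat form admits the provider because the nvidia token alone satisfies the union, even though nothing on that provider can serve the second unit.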

Acceptance criteria

  • projectRow populates gpuModels with both vendor/name and vendor tokens for every physical GPU.
  • Aggregator emits per-unit gpuTokens covering: (a) vendor-only (wildcard model) → vendor token only, (b) explicit vendor/model → specific token, (c) multiple OR-alternative GPU attributes in one unit → all tokens, (d) units without GPUs → empty array.
  • Repository emits one gpu_models && clause per non-empty unit, AND'd together.
  • Integration tests: vendor-only request matches mixed-model providers; multi-spec-OR within a unit; multi-unit AND with divergent GPU needs; no-GPU unit emits no clause; vendor-only request excludes wrong-vendor providers.
  • No regression in capacity / signedBy / self-attribute prefilter behavior from Issue 1.
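As a sketch of aggregator cases (a) through (d), the function below maps one resource unit to its gpuTokens. The `GpuSpec` and `ResourceUnit` shapes are hypothetical, and treating a missing model or "*" as the wildcard is an assumption about how the request is represented upstream.

```typescript
// Hypothetical request-side shapes; the real aggregator input may differ.
interface GpuSpec {
  vendor: string;
  model?: string; // undefined or "*" is assumed to mean "any model"
}

interface ResourceUnit {
  gpuSpecs: GpuSpec[]; // OR-alternatives; empty when the unit needs no GPU
}

// Maps one resource unit to the OR-alternative tokens for its SQL clause.
function toGpuTokens(unit: ResourceUnit): string[] {
  const tokens = new Set<string>();
  for (const spec of unit.gpuSpecs) {
    const wildcard = spec.model === undefined || spec.model === "*";
    tokens.add(wildcard ? spec.vendor : `${spec.vendor}/${spec.model}`);
  }
  return Array.from(tokens);
}

const vendorOnly = toGpuTokens({ gpuSpecs: [{ vendor: "nvidia", model: "*" }] });
const explicit = toGpuTokens({ gpuSpecs: [{ vendor: "nvidia", model: "H100" }] });
const multiSpec = toGpuTokens({
  gpuSpecs: [
    { vendor: "nvidia", model: "H100" },
    { vendor: "nvidia", model: "A100" },
  ],
});
const noGpu = toGpuTokens({ gpuSpecs: [] });
// vendorOnly: ["nvidia"]
// explicit:   ["nvidia/H100"]
// multiSpec:  ["nvidia/H100", "nvidia/A100"]
// noGpu:      []
```

Each result becomes one entry in the filter's units array, so the empty array for a no-GPU unit is what tells the repository to emit no clause for it.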

Notes

GPU RAM and interface constraints are intentionally left out of the SQL prefilter; the bin-packer covers them.
