Bug: sparse (g,b) modeling requires hard-coded workaround #167

## Summary
When modeling nodal allocation with sparse `(generator,bus)` connectivity, Arco currently pushes users toward hard-coded constraints and hard-coded per-generator bus sets.

This makes model code brittle and blocks reusable formulations.

## Current behavior
Given sparse data (only some `(g,b)` rows exist in `dist.csv`):

- `if {distance[g,b]}` in a generated constraint fails with:
  - `unsupported parameter reference 'distance'`
- Replacing with `distance_km[g,b]` leads to:
  - `missing required data point distance_km for key ...`
- Selector/filter variants that try to express the pair intersection do not resolve this case either.
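
The second failure has a simple analogue outside Arco: iterating the full Cartesian product of generators and buses while the table holds rows only for connected pairs. The Python sketch below is illustrative only. The data values are invented and the helper names are hypothetical; it makes no claim about how Arco resolves keys internally.

```python
from itertools import product

# Hypothetical sparse data: only connected (generator, bus) pairs have a row.
distance_km = {("g1", "b1"): 12.0, ("g1", "b2"): 30.5, ("g2", "b2"): 8.0}
generators = ["g1", "g2"]
buses = ["b1", "b2"]


def missing_pairs(table, gens, bus_ids):
    """Return the (g, b) pairs of the Cartesian product that have no row."""
    return [pair for pair in product(gens, bus_ids) if pair not in table]


def dense_lookup(table, gens, bus_ids):
    """Look up every pair of the Cartesian product; raises KeyError on a gap."""
    return [table[pair] for pair in product(gens, bus_ids)]


print(missing_pairs(distance_km, generators, buses))  # [('g2', 'b1')]
```

Calling `dense_lookup` on the same data raises `KeyError` for `('g2', 'b1')`, which mirrors the `missing required data point` error reported above.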

## Expected behavior
Users should be able to model sparse pair intersections **without hard-coding** per-generator sets or duplicating constraints per `(area, tech)` literal.

---

## Minimal example (desired style)
```kdl
data gen_sites from="gen_sites.csv" {
  set generators alias=g
  set area alias=a
  set tech alias=i
  param cost_spur_usd_per_km_mw {index generators}
}

data bus_sites from="bus_sites.csv" {
  set buses alias=b
}

data distance from="dist.csv" {
  index generators buses
  param distance_km {index generators; index buses}
}

data target from="target.csv" {
  param mw_target {index area; index tech}
}

model M {
  control x lower=0 {index generators; index buses}

  // This should work over sparse connectivity
  constraint capacity_target {
    index a { in area }
    index i { in tech }
    expression {
      sum(x[g,b] for g in generators[area=a tech=i] for b in buses if distance_km[g,b]) >= mw_target[a,i]
    }
  }
}
```
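
To pin down the intended semantics of `capacity_target`, here is a minimal Python sketch of which `x[g,b]` terms each `(a, i)` instance should contain. The data values and the `constraint_terms` helper are assumptions made for illustration; the point is only that `if distance_km[g,b]` should act as an existence test rather than a lookup that can fail.

```python
# Hypothetical data mirroring the CSV-backed sets and params in the example.
gen_sites = {
    "g1": {"area": "a1", "tech": "solar"},
    "g2": {"area": "a1", "tech": "solar"},
    "g3": {"area": "a2", "tech": "wind"},
}
buses = ["b1", "b2"]
distance_km = {
    ("g1", "b1"): 12.0,
    ("g1", "b2"): 30.5,
    ("g2", "b2"): 8.0,
    ("g3", "b1"): 4.0,
}


def constraint_terms(area, tech):
    """Pairs (g, b) whose x[g,b] should appear in capacity_target[area, tech]."""
    return [
        (g, b)
        for g, attrs in gen_sites.items()
        if attrs["area"] == area and attrs["tech"] == tech
        for b in buses
        if (g, b) in distance_km  # existence test, never a failing lookup
    ]


print(constraint_terms("a1", "solar"))
# [('g1', 'b1'), ('g1', 'b2'), ('g2', 'b2')]
```

With this data, the `("a1", "solar")` instance sums exactly the three connected pairs, and no term is generated for the missing `("g2", "b1")` row.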

---

## Simple workaround currently required (hard-coded)
```kdl
// generated from dist.csv outside the model
set buses_for_g1 { "b1"; "b2" }
set buses_for_g2 { "b2" }

constraint capacity_target_a1_solar {
  expression {
    sum(x["g1",b] for b in buses_for_g1)
    + sum(x["g2",b] for b in buses_for_g2)
    >= 500
  }
}

minimize TotalCost {
  sum((distance_km["g1",b] * cost_spur_usd_per_km_mw["g1"]) * x["g1",b] for b in buses_for_g1)
  +
  sum((distance_km["g2",b] * cost_spur_usd_per_km_mw["g2"]) * x["g2",b] for b in buses_for_g2)
}
```

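This works, but the bus sets must be regenerated whenever `dist.csv` changes and the constraint must be duplicated for every `(area, tech)` pair, so it scales poorly and is hard to maintain.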
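
For reference, the "generated from dist.csv outside the model" step can be sketched roughly as follows. This is an illustrative Python script, not the tooling actually in use; the CSV column names (`generators`, `buses`, `distance_km`) and the sample rows are assumptions.

```python
import csv
import io
from collections import defaultdict

# Stand-in for dist.csv; the column names are assumptions.
DIST_CSV = """generators,buses,distance_km
g1,b1,12.0
g1,b2,30.5
g2,b2,8.0
"""


def bus_set_declarations(csv_text):
    """Emit one `set buses_for_<g> { ... }` line per generator in the CSV."""
    buses_by_gen = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        buses_by_gen[row["generators"]].append(row["buses"])
    lines = []
    for g, bus_ids in buses_by_gen.items():
        members = "; ".join(f'"{b}"' for b in bus_ids)
        lines.append(f"set buses_for_{g} {{ {members} }}")
    return lines


print("\n".join(bus_set_declarations(DIST_CSV)))
```

Every change to the connectivity data means rerunning a script like this and pasting its output back into the model, which is the maintenance burden this issue is about.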

---

## Requirements to remove hard-coding
1. First-class sparse pair iteration domain (iterate existing `(g,b)` rows directly).
2. Safe existence semantics in filters/conditions for sparse lookups.
3. Dynamic intersection filtering with bound loop vars (`a`, `i`, `g`, `b`).
4. Early index-signature validation (`param` declared indices vs use-site indices).
5. No panics on invalid selector syntax; always structured diagnostics.

Nice-to-have:
- Better diagnostics that suggest sparse-domain iteration when Cartesian iteration hits missing keys.
- Canonical sparse network example in docs.
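
As a rough illustration of requirements 1, 2, and 4, the Python sketch below models a sparse parameter with pair iteration, a non-failing existence test, and a declared-versus-used index check. The `SparseParam` class and its method names are hypothetical and are not a proposed Arco API; they only show the behavior being asked for.

```python
class SparseParam:
    """Hypothetical sparse parameter keyed by tuples of index values."""

    def __init__(self, name, index_names, rows):
        self.name = name
        self.index_names = tuple(index_names)
        self.rows = dict(rows)

    def pairs(self):
        """Requirement 1: iterate only the keys that actually exist."""
        return list(self.rows)

    def exists(self, *key):
        """Requirement 2: existence test that never raises on a missing key."""
        return key in self.rows

    def check_use(self, *use_site_indices):
        """Requirement 4: compare declared indices with use-site indices."""
        if tuple(use_site_indices) != self.index_names:
            raise ValueError(
                f"{self.name} is declared over {self.index_names}, "
                f"but used with {tuple(use_site_indices)}"
            )


distance_km = SparseParam(
    "distance_km",
    ("generators", "buses"),
    {("g1", "b1"): 12.0, ("g1", "b2"): 30.5, ("g2", "b2"): 8.0},
)
distance_km.check_use("generators", "buses")  # matches the declaration
print(distance_km.pairs())
print(distance_km.exists("g2", "b1"))  # False
```

A use such as `distance_km.check_use("buses", "generators")` raises `ValueError` immediately, which is the kind of early, structured feedback requirement 4 describes.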

