Summary
When modeling nodal allocation with sparse (generator,bus) connectivity, Arco currently pushes users toward hard-coded constraints and hard-coded per-generator bus sets.
This makes model code brittle and blocks reusable formulations.
Current behavior
Given sparse data (only some (g,b) rows exist in dist.csv):
- if {distance[g,b]} in a generated constraint fails with:
  unsupported parameter reference 'distance'
- Replacing it with distance_km[g,b] leads to:
  missing required data point distance_km for key ...
- Selector/filter variants that try to express the pair intersection do not currently work for this case either.
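The second failure can be reproduced outside Arco. The Python sketch below is only an analogy, not Arco's implementation: it assumes distance_km is stored as a dictionary keyed by (g, b) pairs, with made-up values. It shows that a cartesian loop with an unguarded lookup hits a missing key, while an explicit existence test skips the missing pair.

```python
# Hypothetical sparse data: only some (generator, bus) pairs have a distance.
distance_km = {("g1", "b1"): 12.0, ("g1", "b2"): 30.0, ("g2", "b2"): 7.5}
generators = ["g1", "g2"]
buses = ["b1", "b2"]


def unguarded_pairs():
    # Cartesian loop with a direct value lookup: raises KeyError on ("g2", "b1").
    return [(g, b) for g in generators for b in buses if distance_km[g, b]]


def guarded_pairs():
    # Existence test instead of a value lookup: missing pairs are skipped.
    return [(g, b) for g in generators for b in buses if (g, b) in distance_km]


try:
    unguarded_pairs()
except KeyError as err:
    print("missing key:", err.args[0])

print(guarded_pairs())
```

The "if distance_km[g,b]" filter in the desired style is meant to behave like guarded_pairs here, that is, as an existence test rather than a required lookup.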
Expected behavior
Users should be able to model sparse pair intersections without hard-coding per-generator sets or duplicating constraints per (area, tech) literal.
Minimal example (desired style)
data gen_sites from="gen_sites.csv" {
  set generators alias=g
  set area alias=a
  set tech alias=i
  param cost_spur_usd_per_km_mw {index generators}
}
data bus_sites from="bus_sites.csv" {
  set buses alias=b
}
data distance from="dist.csv" {
  index generators buses
  param distance_km {index generators; index buses}
}
data target from="target.csv" {
  param mw_target {index area; index tech}
}
model M {
  control x lower=0 {index generators; index buses}
  // This should work over sparse connectivity
  constraint capacity_target {
    index a { in area }
    index i { in tech }
    expression {
      sum(x[g,b] for g in generators[area=a tech=i] for b in buses if distance_km[g,b]) >= mw_target[a,i]
    }
  }
}
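To make the intended semantics concrete, here is a rough Python sketch of what capacity_target should expand to. It is an illustration under assumptions, not Arco behavior: the dictionaries stand in for the CSV inputs, generator g3, area a2, tech wind and the 200 target are invented, and capacity_target_terms is a hypothetical helper.

```python
# Hypothetical data mirroring the CSV inputs; all values are illustrative.
gen_sites = {"g1": ("a1", "solar"), "g2": ("a1", "solar"), "g3": ("a2", "wind")}
connected = {("g1", "b1"), ("g1", "b2"), ("g2", "b2"), ("g3", "b1")}  # rows of dist.csv
mw_target = {("a1", "solar"): 500, ("a2", "wind"): 200}


def capacity_target_terms(area, tech):
    """Return the (g, b) pairs whose x[g,b] appear on the left-hand side."""
    return sorted(
        (g, b)
        for (g, b) in connected            # iterate existing rows only
        if gen_sites[g] == (area, tech)    # filter by the bound loop vars a, i
    )


for (a, i), target in mw_target.items():
    lhs = " + ".join(f"x[{g},{b}]" for g, b in capacity_target_terms(a, i))
    print(f"{lhs} >= {target}")
```

For (a1, solar) this yields x[g1,b1] + x[g1,b2] + x[g2,b2] >= 500, the same terms the hard-coded workaround spells out by hand.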
Simple workaround currently required (hard-coded)
// generated from dist.csv outside the model
set buses_for_g1 { "b1"; "b2" }
set buses_for_g2 { "b2" }
constraint capacity_target_a1_solar {
  expression {
    sum(x["g1",b] for b in buses_for_g1)
    + sum(x["g2",b] for b in buses_for_g2)
    >= 500
  }
}
minimize TotalCost {
  sum((distance_km["g1",b] * cost_spur_usd_per_km_mw["g1"]) * x["g1",b] for b in buses_for_g1)
  +
  sum((distance_km["g2",b] * cost_spur_usd_per_km_mw["g2"]) * x["g2",b] for b in buses_for_g2)
}
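This works, but it scales poorly and is hard to maintain: each (area, tech) target needs its own constraint, and the bus sets must be regenerated whenever dist.csv changes.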
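For reference, the external generation step looks roughly like the Python sketch below. The column names (generators, buses, distance_km) and the sample rows are assumptions, since the real dist.csv header is not shown here, and both helper functions are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Stand-in for dist.csv; the column names and rows are assumptions.
DIST_CSV = """generators,buses,distance_km
g1,b1,12.0
g1,b2,30.0
g2,b2,7.5
"""


def bus_sets_by_generator(csv_text):
    """Group the buses listed in the distance table by generator."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row["generators"]].append(row["buses"])
    return dict(groups)


def render_set_declarations(groups):
    """Emit one hard-coded set declaration per generator."""
    lines = []
    for g, buses in sorted(groups.items()):
        members = "; ".join(f'"{b}"' for b in buses)
        lines.append(f"set buses_for_{g} {{ {members} }}")
    return "\n".join(lines)


print(render_set_declarations(bus_sets_by_generator(DIST_CSV)))
```

Every model that uses sparse connectivity ends up carrying a script like this, plus a second generator for the per-(area, tech) constraints.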
Requirements to remove hard-coding
- First-class sparse pair iteration domain (iterate existing (g,b) rows directly).
- Safe existence semantics in filters/conditions for sparse lookups.
- Dynamic intersection filtering with bound loop vars (a, i, g, b).
- Early index-signature validation (param declared indices vs use-site indices).
- No panics on invalid selector syntax; always structured diagnostics.
Nice-to-have:
- Better diagnostics that suggest sparse-domain iteration when a cartesian iteration hits missing keys.
- Canonical sparse network example in docs.
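As an illustration of the index-signature validation and structured-diagnostics requirements, here is a minimal Python sketch. It is not Arco's internals: the Diagnostic type, the diagnostic codes and the DECLARED table are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnostic:
    code: str      # hypothetical machine-readable code
    message: str   # human-readable explanation


# Hypothetical declaration table: parameter name -> declared index sets.
DECLARED = {
    "distance_km": ("generators", "buses"),
    "mw_target": ("area", "tech"),
}


def check_reference(name, use_site_indices):
    """Return diagnostics for one parameter reference (empty list if valid)."""
    if name not in DECLARED:
        return [Diagnostic("unknown-param", f"unknown parameter '{name}'")]
    declared = DECLARED[name]
    used = tuple(use_site_indices)
    if used != declared:
        return [Diagnostic(
            "index-mismatch",
            f"'{name}' is declared over {declared} but used with {used}",
        )]
    return []


# A reference to the data block name instead of the parameter is reported, not raised.
print(check_reference("distance", ["generators", "buses"]))
print(check_reference("mw_target", ["tech", "area"]))
```

The point is that both mistakes come back as values the caller can render, rather than as a panic or a late data-loading error.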