diff --git a/GLOSSARY.md b/GLOSSARY.md index b4f5a33..bdb8b1a 100644 --- a/GLOSSARY.md +++ b/GLOSSARY.md @@ -20,7 +20,7 @@ Unified terminology index for Spore. Each term links to the SEP where it is auth ## C -**Declared effects** (SEP-0003): The effect names explicitly written on a function, module, or manifest surface. +**Declared effects** (SEP-0003): The effect names explicitly written on a function or platform/manifest surface. **Effect ceiling** (SEP-0003, SEP-0008): The maximum set of effects available to a scope. Function-level `uses [...]` clauses are standardized today; broader module/project ceilings remain reserved follow-up design space. @@ -32,7 +32,7 @@ Unified terminology index for Spore. Each term links to the SEP where it is auth **`Channel[T]`** (SEP-0007): Bounded channel type for inter-task message passing, parameterized by the message type. -**Content-addressed package** (SEP-0008): Package identified by SHA-256 hash of its normalized source, enabling reproducible builds and cache deduplication. +**Content-addressed package** (SEP-0008): Package identified by BLAKE3 hashes of its normalized signatures and implementations, enabling reproducible builds and cache deduplication. **Cost budget** (SEP-0004): The declared cost bound on a function, verified at compile time against inferred cost of the function body. @@ -72,9 +72,9 @@ Unified terminology index for Spore. Each term links to the SEP where it is auth **Effect** (SEP-0003): An observable interaction with the outside world (I/O, mutation, randomness), tracked via the effect system. -**Effect alias** (SEP-0003): A named shorthand for a set of atomic effects, written as `effect FileIO = FileRead | FileWrite`. +**Effect alias** (SEP-0003): A named shorthand for a set of atomic effects, written as `effect FileIO = FileRead | FileWrite;`. -**Effect handler** (SEP-0008): Platform-provided implementation of an effect's operations, connecting `foreign fn` declarations to native code. 
+**Effect handler** (SEP-0003, SEP-0008): Implementation of an effect's operations. SEP-0003 owns handler semantics; SEP-0008 explains Platform-provided handlers and host adapters. **Enum** (SEP-0002): Algebraic data type with named variants, each optionally carrying data. Defined with `type Name { Variant1(T), Variant2 }`. @@ -114,7 +114,7 @@ Unified terminology index for Spore. Each term links to the SEP where it is auth **`Mul`** (SEP-0002): Compiler-known trait for the `*` operator on types that explicitly implement multiplication. -**Module** (SEP-0008): A single Spore source file that declares its own visibility boundaries and effect requirements. +**Module** (SEP-0008): A single Spore source file whose module path is derived from its filesystem path. Function-level signatures inside the module declare effects. ## N @@ -146,7 +146,7 @@ Unified terminology index for Spore. Each term links to the SEP where it is auth **`Serialize`** (SEP-0002): Compiler-known trait for converting a typed value into a serialized format. -**`select`** (SEP-0007): Expression that awaits the first of multiple tasks to complete, enabling concurrent race patterns. +**`select`** (SEP-0007): Expression that waits for the first ready channel arm or timeout arm, enabling concurrent race patterns. **SEP (Spore Enhancement Proposal)** (SEP-0000): A design document proposing a change or addition to Spore, following a structured review process. @@ -180,7 +180,7 @@ Unified terminology index for Spore. Each term links to the SEP where it is auth ## U -**`uses` clause** (SEP-0003): Annotation on a function or module declaring required effects, written as `uses [Effect1, Effect2]`. +**`uses` clause** (SEP-0003): Annotation on a function declaring required effects, written as `uses [Effect1, Effect2]`. ## V diff --git a/README.md b/README.md index 0fc540a..ada7b15 100644 --- a/README.md +++ b/README.md @@ -1,68 +1,94 @@ # spore-evolution -Proposal repository for the Spore programming project. 
+Proposal portal for the Spore programming project. -This repository is the long-lived home for major language, tooling, and process proposals that affect Spore as a whole. +This repository is the long-lived home for Spore Enhancement Proposals (SEPs): +process decisions, language design records, tooling protocols, package-system +design, and other cross-cutting changes that affect Spore as a whole. -## Release-safety notice +## Read this first -The SEPs in this repository are design records and proposal texts. They are not, -by themselves, the current implementation truth, a compatibility guarantee, or a -public release contract for Spore. +The SEPs in this repository are design records. They are not, by themselves, the +current implementation truth, a compatibility guarantee, or a public release +contract for Spore. -All numbered SEPs are currently `Draft`. During this bootstrap phase, readers -should expect some examples and terminology to lag behind the implementation. For current release behavior, installation guidance, supported syntax, and -implementation status, use the implementation repository first: `spore/README.md` -and `spore/docs/DESIGN.md`. +implementation status, start with the implementation repository: +`spore/README.md` and `spore/docs/DESIGN.md`. -**Authoritative surface typing (as of the alignment pass):** default unsuffixed literals are **`I64`** (integer) and **`F64`** (float); UTF-8 text is **`Str`**; there is **no `Char`**. Full rules, metavariables **ι** / **φ** for other widths, and “Platform decides” ABI details are in [SEP-0002 §3.1 / Summary](seps/SEP-0002-type-system.md). +This repository is authoritative for proposal history and accepted design +direction. During the bootstrap phase, Draft SEPs may still include target +behavior, future protocol shapes, or examples that are ahead of the compiler. 
-## What lives here +**Current surface typing baseline:** default unsuffixed literals are **`I64`** +for integers and **`F64`** for floats; UTF-8 text is **`Str`**; there is no +`Char` type. SEP-0002 owns the full type-system rules and metavariables for +other fixed widths. -- `drafts/` — unnumbered proposal drafts under active discussion -- `seps/` — numbered SEP documents and historical process records -- `templates/` — authoring templates for new proposals -- `schemas/` — machine-readable rules for SEP metadata and shared machine contracts -- `scripts/` — repository validation and automation helpers +## SEP status -## Current entry points +| SEP | Title | Status | Role | +|---|---|---|---| +| [SEP-0000](seps/SEP-0000-process.md) | Spore Enhancement Proposal Process | Accepted | Repository process and lifecycle | +| [SEP-0001](seps/SEP-0001-core-syntax.md) | Core Syntax & Signatures | Accepted | Root surface grammar and signature layout | +| [SEP-0002](seps/SEP-0002-type-system.md) | Type System | Draft | Type semantics and checking | +| [SEP-0003](seps/SEP-0003-effect-system.md) | Effect System | Draft | Effect algebra, handlers, and diagnostics | +| [SEP-0004](seps/SEP-0004-cost-analysis.md) | Cost Analysis & Decidability | Draft | Four-slot cost model and verification | +| [SEP-0005](seps/SEP-0005-hole-system.md) | Hole System & Agent Protocol | Draft | Typed holes and agent-facing reports | +| [SEP-0006](seps/SEP-0006-compiler-architecture.md) | Compiler Architecture | Draft | Compiler pipeline and diagnostics | +| [SEP-0007](seps/SEP-0007-concurrency-model.md) | Concurrency Model | Draft | Structured concurrency semantics | +| [SEP-0008](seps/SEP-0008-module-package-system.md) | Module & Package System | Draft | Modules, manifests, platforms, and packages | +| [SEP-0009](seps/SEP-0009-standard-library.md) | Standard Library Surface | Draft | Prelude, core modules, and platform libraries | -- [`VISION.md`](VISION.md) — design philosophy and core principles -- 
[`ROADMAP.md`](ROADMAP.md) — long-term plan organized by system area -- [`seps/SEP-0000-process.md`](seps/SEP-0000-process.md) — the draft SEP process -- [`seps-index.json`](seps-index.json) — generated machine-readable SEP index -- [`templates/standards-track.md`](templates/standards-track.md) — Standards Track template -- [`templates/process.md`](templates/process.md) — Process template -- [`templates/informational.md`](templates/informational.md) — Informational template +The generated machine-readable index is [`seps-index.json`](seps-index.json). -## What is an SEP? +## Reading path -**SEP** stands for **Spore Enhancement Proposal**. +Read [VISION.md](VISION.md) first for the design philosophy. Then use SEPs in +dependency order: -An SEP is the design record for changes to Spore semantics, standard-library -surface, tooling protocols, cross-cutting system design, or the project process itself. -For the decision threshold, lifecycle, and authoring rules, see -[`seps/SEP-0000-process.md`](seps/SEP-0000-process.md). +1. [SEP-0000](seps/SEP-0000-process.md) for how decisions are made. +2. [SEP-0001](seps/SEP-0001-core-syntax.md) for accepted syntax forms. +3. [SEP-0002](seps/SEP-0002-type-system.md) through [SEP-0004](seps/SEP-0004-cost-analysis.md) for core static semantics. +4. [SEP-0005](seps/SEP-0005-hole-system.md) and [SEP-0006](seps/SEP-0006-compiler-architecture.md) for tool and compiler surfaces. +5. [SEP-0007](seps/SEP-0007-concurrency-model.md) through [SEP-0009](seps/SEP-0009-standard-library.md) for larger system layers. + +Use [GLOSSARY.md](GLOSSARY.md) when checking cross-SEP terminology. 
+ +## Repository layout + +- `drafts/` - unnumbered proposal drafts under active discussion +- `seps/` - numbered SEP documents and historical process records +- `templates/` - authoring templates for new proposals +- `schemas/` - machine-readable rules for SEP metadata and shared contracts +- `scripts/` - repository validation and automation helpers -## Status +## Authoring + +**SEP** stands for **Spore Enhancement Proposal**. An SEP records changes to +Spore semantics, standard-library surface, tooling protocols, cross-cutting +system design, or the project process itself. + +For the decision threshold, lifecycle, and authoring rules, see +[SEP-0000](seps/SEP-0000-process.md). New proposals should start from the +matching template: -The process in this repository is still being bootstrapped. +- [Standards Track](templates/standards-track.md) +- [Process](templates/process.md) +- [Informational](templates/informational.md) -`SEP-0000` is intentionally a draft. We expect to revise the process, template, and metadata rules before treating them as settled. +## Validation -## Tooling direction +Run the repository checks before opening a PR: -The current first-pass automation stack is: +```bash +uv run scripts/validate_sep_documents.py +uv run scripts/check_sep_index.py +uv run scripts/check_terminology_consistency.py +``` -- Shared local/CI hook runner: `prek` -- Markdown lint: `rumdl` -- Prose and terminology lint: `Vale` -- Link checking: `lychee` -- Spelling: `typos` -- Front matter schema validation: `check-jsonschema` -- Machine-readable SEP index: committed `seps-index.json`, checked for drift in local hooks and CI -- SEP-specific repository rules: minimal repo-local Python checks -- CI platform: GitHub Actions +If SEP metadata changed, regenerate the committed index first: -The static site stack is intentionally deferred for now. We want to stabilize the proposal workflow and quality gates first, then choose the publishing stack separately. 
+```bash +uv run scripts/check_sep_index.py --fix +``` diff --git a/seps-index.json b/seps-index.json index 30a0aee..9148269 100644 --- a/seps-index.json +++ b/seps-index.json @@ -20,7 +20,7 @@ "path": "seps/SEP-0001-core-syntax.md", "sep": 1, "title": "SEP-0001: Core Syntax & Signatures", - "status": "Draft", + "status": "Accepted", "type": "Standards Track", "authors": [ "Zhan Rongrui" @@ -59,6 +59,7 @@ ], "created": "2026-03-31", "requires": [ + 1, 2 ], "discussion": "https://github.com/spore-lang/spore-evolution/discussions/3", @@ -76,6 +77,7 @@ ], "created": "2026-03-31", "requires": [ + 1, 2, 3 ], @@ -94,6 +96,7 @@ ], "created": "2026-03-31", "requires": [ + 1, 2, 3, 4 @@ -135,6 +138,7 @@ "created": "2026-03-31", "requires": [ 1, + 2, 3, 4 ], @@ -154,7 +158,9 @@ "created": "2026-03-31", "requires": [ 1, - 3 + 2, + 3, + 4 ], "discussion": "https://github.com/spore-lang/spore-evolution/discussions/8", "pr": null, diff --git a/seps/SEP-0000-process.md b/seps/SEP-0000-process.md index b890452..8d46f0e 100644 --- a/seps/SEP-0000-process.md +++ b/seps/SEP-0000-process.md @@ -40,7 +40,7 @@ These changes have broad, long-term consequences. Normal implementation pull req - the relationship between human UX and Agent UX - migration and compatibility implications -Spore therefore needs a durable, versioned, reviewable design archive. +Spore therefore needs a durable, reviewable design archive. 
## Goals diff --git a/seps/SEP-0001-core-syntax.md b/seps/SEP-0001-core-syntax.md index 047d315..bbb891a 100644 --- a/seps/SEP-0001-core-syntax.md +++ b/seps/SEP-0001-core-syntax.md @@ -1,7 +1,7 @@ --- sep: 1 title: "SEP-0001: Core Syntax & Signatures" -status: Draft +status: Accepted type: Standards Track authors: - Zhan Rongrui @@ -14,37 +14,51 @@ superseded_by: null # SEP-0001: Core Syntax & Signatures -> **Executive Summary**: Defines Spore's core syntax as an expressions-only language with no statements or loops — all iteration uses recursion and higher-order functions (map/fold/filter/each). Introduces unified function signatures with a fixed clause order (return → errors → where → uses → cost → spec → body), pattern matching as the primary control flow, and a clean separation between pure computation and effectful operations. Traits define type interfaces while effects track external operations; the compiler infers purity and determinism from the declared effect set. The `spec` clause embeds typechecked behavioral contracts (examples and properties) directly in function signatures, making intent visible to the compiler, documentation, and hole-reporting tooling. +> **Executive Summary**: Defines Spore's core syntax and function-signature layout. SEP-0001 fixes the shared surface forms that later SEPs interpret: type syntax, error clauses, effect clauses, cost clauses, holes, concurrency forms, imports, and `spec` blocks. Detailed type, effect, cost, hole, compiler, concurrency, module, and standard-library semantics are delegated to their corresponding SEPs. ## Summary -This proposal defines the complete core syntax and function signature system for the Spore programming language v0.1. Spore is an expression-based, statically typed language with effect tracking, explicit resource cost tracking, and guaranteed tail-call optimization. 
The language draws from Rust, OCaml, Roc, Gleam, and Elm, combining curly-brace block scoping with Rust-style semicolon semantics, algebraic data types, exhaustive pattern matching, and a novel function signature system that encodes generic constraints (`where`), effect requirements (`uses`), error sets (`!`), and a four-slot **`cost [compute, alloc, io, parallel]`** clause as composable signature metadata. Spore deliberately eliminates loops in favor of recursion and higher-order functions, uses square brackets for generics (`List[I64]`), auto-infers effect properties from the `uses` set, and provides a pipe operator (`|>`) for readable data-flow composition. +This proposal defines the core syntax and function-signature surface for the Spore programming language. Spore uses expression-oriented curly-brace syntax, Rust-style semicolon rules, algebraic data type syntax, pattern matching, square-bracket generics, and a regular function signature structure with optional `!`, `where`, `cost`, `uses`, and `spec` clauses. -## Motivation +This SEP specifies the **surface grammar** and cross-SEP signature layout. SEP-0002 specifies type meaning and checking, SEP-0003 specifies effect semantics, SEP-0004 specifies cost verification, SEP-0005 specifies holes, SEP-0006 specifies compiler behavior and diagnostics, SEP-0007 specifies concurrency semantics, SEP-0008 specifies module/package resolution, and SEP-0009 specifies the standard-library surface. SEP-0001 remains dependency-free so those later SEPs can depend on a stable syntactic root without creating cycles. -Modern programming languages force developers to choose between safety and ergonomics. Rust provides strong safety guarantees but with significant syntactic and cognitive overhead. Functional languages like Haskell and OCaml offer powerful type systems but often feel alien to developers from imperative backgrounds. 
Meanwhile, AI agents that generate and analyze code need languages whose structure is predictable and machine-parseable. +## Normative scope and dependency boundaries -Spore aims to occupy a unique design point: +SEP-0001 is the root syntax SEP. Later SEPs may assign semantics to forms defined here, but they must not re-specify incompatible grammar. -1. **Human-first readability**: Expression-based syntax with familiar curly braces and semicolons reduces the learning curve for developers coming from Rust, TypeScript, or Kotlin, while the pipe operator and pattern matching enable expressive functional composition. +| Surface area | SEP-0001 owns | Semantic owner | +| -------------------------------------------- | --------------------------------------------------------------------- | ------------------------------------------------------------------- | +| Function declarations | clause order, delimiters, optional body shape | SEP-0002 for type checking, SEP-0004 for cost, SEP-0003 for effects | +| Type syntax | identifiers, generic brackets, tuple/function/refinement syntax forms | SEP-0002 | +| Error clauses and `?` / `throw` syntax | spelling and placement | SEP-0002 for error-set typing and propagation | +| `effect`, `handler`, `perform`, `handle` | declaration and expression grammar | SEP-0003 | +| `cost [...]` | clause placement and expression syntax envelope | SEP-0004 | +| `spec { ... }` | clause placement and item grammar | SEP-0006 for test execution, diagnostics, and spec hashing | +| Holes (`?name`) | expression grammar | SEP-0005 | +| `parallel_scope`, `spawn`, `select`, `await` | expression grammar | SEP-0007 | +| `import`, `alias`, visibility | declaration grammar | SEP-0008 | +| Prelude and library names | examples only | SEP-0009 | -2. 
**Agent-friendly structure**: A fixed operator set (no custom operators), regular grammar, and explicit function metadata (`uses`, `cost`, `where`) make Spore code trivially parseable by AI agents and static analysis tools. Every function signature is a structured contract. +Unless explicitly marked as grammar, examples that mention type names, effect names, cost values, module paths, or standard-library items are illustrative. -3. **Effect safety by default**: Rather than bolting on an effect system after the fact, Spore makes effect requirements a first-class part of function signatures. The compiler infers properties like `pure` and `deterministic` from the declared `uses` set, eliminating manual annotation burden while maintaining full transparency. +## Motivation -4. **No loops — recursion with TCO guarantee**: By removing `for`, `while`, `loop`, `break`, and `continue`, Spore encourages a purely functional iteration style via recursion and higher-order functions (`map`, `fold`, `filter`). The compiler guarantees tail-call optimization, making recursion safe and efficient. +Spore's syntax is designed to balance human readability with predictable machine processing. Curly-brace blocks, explicit delimiters, a fixed operator set, and ordered function clauses make source text regular enough for compilers, formatters, LSPs, and agents to process without relying on implicit layout or custom operator rules. -5. **Incremental elaboration**: The signature system is designed so that a simple pure function has zero overhead (`fn add(x: I64, y: I64) -> I64`), while complex functions progressively add clauses only as needed. This avoids the "all or nothing" annotation burden seen in many effect systems. +Spore aims to occupy a practical syntactic design point: -6. **Cost transparency**: The `cost` clause enables compile-time reasoning about resource consumption, supporting smart-contract use cases and agent-driven cost optimization. +1. 
**Human-readable syntax**: expression-oriented syntax with familiar braces, semicolons, `let`, `fn`, `match`, and postfix type annotations. +2. **Machine-readable structure**: a fixed operator set, explicit delimiters, and a predictable function-clause order. +3. **Progressive signatures**: simple functions remain short, while functions that need additional metadata can add clauses in one canonical order. +4. **One iteration style**: loop keywords are absent from the surface language; recursive and library-based iteration are the accepted syntactic idioms. ## Guide-level explanation -This section introduces Spore's syntax from a user's perspective, building from the simplest forms to the most complex. +This section introduces the surface forms defined by this SEP. ### Your first Spore function -The simplest Spore function looks familiar to anyone who has used Rust, Go, or TypeScript: +A minimal Spore function has a name, typed parameters, a return type, and a block body: ```spore fn add(a: I64, b: I64) -> I64 { @@ -52,7 +66,7 @@ fn add(a: I64, b: I64) -> I64 { } ``` -Note: no `return` keyword is needed. The last expression in a block is its value. The compiler automatically infers that this function is `pure`, `deterministic`, and `total` because it uses no effects. +The last expression in a block is the block value. ### Variables and immutability @@ -64,14 +78,14 @@ let age = 30; let greeting = f"Hello, {name}! You are {age} years old."; ``` -Shadowing is allowed — you can rebind a name in the same scope: +Shadowing reuses a binding name in a later binding: ```spore let x = 10; let x = x + 5; // shadows the previous x ``` -For local mutable state, use `Ref[T]`. 
This is separate from the built-in external effect vocabulary standardized in SEP-0003, so it does not introduce a dedicated built-in `uses` name: +For local mutable state, `Ref[T]` is written as a normal generic type: ```spore let counter = Ref.new(0); @@ -113,7 +127,7 @@ let category = if age < 13 { ### Pattern matching -`match` expressions must be exhaustive — the compiler ensures all cases are covered: +`match` expressions use comma-terminated arms: ```spore type Shape = @@ -146,9 +160,24 @@ let is_weekend = match day { }; ``` -### No loops — recursion and HOFs +### No loop syntax + +Spore has **no** `for`, `while`, `loop`, `break`, or `continue` syntax. Iteration is expressed with recursion or library functions. -Spore has **no** `for`, `while`, `loop`, `break`, or `continue`. All iteration uses recursion (with guaranteed TCO) or higher-order functions: +**Local function declarations.** The block grammar includes a `Statement` form +labeled _local function_ below: nesting `fn` introduces a sibling function scoped +inside that enclosing block. Unlike lambdas (`|params|`), a statement-form `fn` +**does not** close over outer `let` bindings or the enclosing function's +parameters; pass any needed outer state explicitly as arguments to helpers. +Lexical closures and inference rules for lambdas belong to SEP-0002 (**§3.5.1** contrasts nested statement `fn` vs lambdas). + +**Cost and stack space.** How the compiler recognises linear recursion and +computes inferred or declared `cost […]` bounds is SEP-0004. **Tail calls.** +Calls in syntactic tail position (as in the accumulator helper below) are part of the +canonical evaluation model: SEP-0006 assigns them a merging lowering rule so deeply +tail-recursive iteration does **not** build a proportional chain of suspended +caller frames. That restores the predictable stack footprint programmers associate +with loops, even though the surface language has no loop forms. 
```spore // Sum a list using fold @@ -156,15 +185,16 @@ fn sum(numbers: List[I64]) -> I64 { numbers.fold(0, |acc, x| acc + x) } -// Factorial using tail recursion (TCO guaranteed) +// Factorial: inner helper carries the accumulator; parameters `n` vs `k` are +// distinct (no accidental shadow of the caller's iteration variable). fn factorial(n: I64) -> I64 { - fn go(n: I64, acc: I64) -> I64 { - if n <= 1 { acc } else { go(n - 1, n * acc) } + fn go(k: I64, acc: I64) -> I64 { + if k <= 1 { acc } else { go(k - 1, k * acc) } } go(n, 1) } -// Filter and transform using HOFs +// Filter and transform using library functions let result = numbers |> filter(|x| x > 0) |> map(|x| x * 2) @@ -188,8 +218,14 @@ let result = data // With additional arguments: x |> f(y, z) // equivalent to: f(x, y, z) x |> f(_, y) // equivalent to: f(x, y) +f(_, y) // equivalent to: |x| f(x, y) ``` +The `_` placeholder form is a general call-position partial application form: +any call argument written as `_` is converted into a lambda parameter. Pipe use +is only the most common spelling because `x |> f(_, y)` reads as "send `x` into +the placeholder". + ### Ranges and slices Ranges create sequences; slices extract sub-lists: @@ -251,54 +287,9 @@ fn read_config(path: Str) -> Config ! FileError | ParseError { } ``` -Current implementation note: the parser surface already accepts `! E1 | E2`, and -the checker currently validates `?` through call/pipeline propagation rules. This -wave standardizes the **target** canonicalization-first error-set semantics on top -of that shipping surface. It does **not** require a new top-level -`error Alias = ...` declaration form. - -#### Error conversion - -When a function's error set differs from a callee's, Spore can automatically convert errors if a conversion exists: - -```spore -fn load_config(path: Str) -> Config ! 
ConfigError { - let content = read_file(path)?; // FileError -> ConfigError (auto-converted) - let config = parse_toml(content)?; // ParseError -> ConfigError (auto-converted) - config -} -``` - -The compiler looks for a `From` trait implementation (e.g., `impl From[FileError] for ConfigError`) to perform the conversion. If no conversion exists, the error type must be listed in the function's error set directly. - -#### Error recovery patterns - -Use `match` on `Result` to provide fallback values or retry logic: - -```spore -// Default value fallback -fn get_config() -> Config { - match load_config("config.toml") { - Ok(config) => config, - Err(_) => Config.default(), - } -} - -// Retry with tail recursion (TCO guaranteed) -fn fetch_with_retry(url: Str, max_retries: I64) -> Data ! NetworkError { - fn retry(url: Str, attempts: I64, max: I64) -> Data ! NetworkError { - match fetch(url) { - Ok(data) => data, - Err(NetworkError.Timeout) if attempts < max => { - sleep(1000); - retry(url, attempts + 1, max) // tail recursion - }, - Err(e) => throw e, - } - } - retry(url, 0, max_retries) -} -``` +Error-set typing, propagation, conversion, and recovery patterns are owned by +SEP-0002. SEP-0001 only fixes the spelling and placement of `!`, `?`, and +`throw`. ### Generic types with square brackets @@ -325,67 +316,50 @@ fn identity[T](x: T) -> T { ### Function signatures: from simple to complex -The full signature order is: +The canonical signature order is: ```text fn name[T](params) -> ReturnType ! ErrorSet where T: Bound -uses [Effect1, Effect2] cost [compute, alloc, io, parallel] +uses [Effect1, Effect2] spec { - example "...": ... - property "...": |param: Type| ... + } { body } ``` -But you only write what you need: +Only the clauses that are needed are written: ```spore -// 1. Simple pure function — zero overhead fn add(a: I64, b: I64) -> I64 { a + b } -// 2. Function with errors fn parse_int(input: Str) -> I64 ! InvalidFormat { - ... + ?parse_int_body } -// 3. 
With generic constraints and behavioral spec fn serialize[T](value: T) -> Bytes ! SerializeError where T: Serialize cost [500, 50, 0, 1] -spec { - example "empty struct": serialize(Empty {}) == Ok(b"") -} { - ... + ?serialize_body } -// 4. Side-effectful function with intent examples -/// @idempotent fn sync_user_data(user_id: UserId, source: DataSource) -> SyncReport ! NetworkTimeout | AuthExpired -uses [NetConnect, FileRead, Clock] cost [8500, 200, 800, 4] -spec { - example "returns report on success" { - let report = sync_user_data(UserId(1), MockSource.success()); - report.is_ok() && report.unwrap().records_synced > 0 - } - property "idempotent": |id: UserId| { - let r1 = sync_user_data(id, MockSource.success()); - let r2 = sync_user_data(id, MockSource.success()); - r1 == r2 - } -} +uses [NetConnect, FileRead, Clock] { - ... + ?sync_body } ``` +Detailed behavior of error sets, cost vectors, effect requirements, and `spec` +execution is delegated to SEP-0002, SEP-0004, SEP-0003, and SEP-0006. + ### Traits, effects, and the `uses` clause Spore separates type interfaces from effect tracking: @@ -395,105 +369,62 @@ This section is the **normative surface syntax** for these forms. SEP-0002 and S ```spore // trait: type interface trait Display { - fn show(self) -> Str + fn show(self) -> Str; } trait Serialize { - fn serialize(self) -> Bytes ! SerializeError + fn serialize(self) -> Bytes ! SerializeError; } // effect: external world operations effect Console { - fn println(msg: Str) -> () - fn read_line() -> Str ! IoError + fn println(msg: Str) -> (); + fn read_line() -> Str ! IoError; } // effect alias (uses | for union) -effect HttpClient = NetConnect | Clock -effect CLI = Console | FileRead | FileWrite | Env | Spawn | Exit +effect HttpClient = NetConnect | Clock; +effect CLI = Console | FileRead | FileWrite | Env | Spawn | Exit; ``` -Effect operations and handler binding are also centralized here. `perform` is a reserved keyword for effect-operation expressions. 
The settled handler grammar is `handler as (...) { ... }` for declarations and `handle { ... } with { ... }` for installation: the `with` block may contain inline effect arms via `on Effect.op(...) => ...` and/or named handler installations via `use HandlerName { ... }`. SEP-0003 defines the effect algebra and handler semantics; this SEP fixes the corresponding surface syntax. - -The `uses` clause declares what effects a function requires. The compiler **auto-infers** effect properties from this set: - -| `uses` set | Inferred properties | -|---|---| -| `uses []` (or omitted for pure) | `pure`, `deterministic`, `total` | -| `uses [FileRead]` | `deterministic` | -| Contains `Random` or `Clock` | neither `pure` nor `deterministic` | - -The `idempotent` property cannot be inferred and is annotated via doc comment: `/// @idempotent`. - -The `uses` clause can mix atomic effect names with named effect aliases: +SEP-0001 fixes the grammar for `trait`, `effect`, `handler`, `perform`, +`handle`, and `uses [...]`. SEP-0002 owns trait semantics; SEP-0003 owns effect +algebra, handler behavior, inferred effect attributes, and alias expansion. At the +syntax level, a `uses` clause is a bracketed list of effect names or aliases: ```spore fn query_database(sql: Str) -> Data ! DbError | Timeout uses [NetConnect] { - ... + ?query_database_body } fn run_cli(config_path: Str) -> () ! Error uses [CLI] { - ... + ?run_cli_body } ``` ### Behavioral specification (`spec` clause) -Spore functions can include a `spec` block that expresses behavioral intent as typechecked contracts. The `spec` clause sits after `where`, `uses`, and `cost`, immediately before the function body — making intent structurally part of the signature, visible to the compiler, surfaced in documentation, and available to testing and hole-reporting tooling. 
- -```spore -fn add(a: I64, b: I64) -> I64 -spec { - example "positive inputs": add(2, 3) == 5 - example "identity": add(0, 42) == 42 - example "commutativity": add(1, 2) == add(2, 1) -} -{ - a + b -} -``` - -**`example` items** are concrete labeled test cases. The body must evaluate to `Bool`: +A function signature may include a `spec` block after `where`, `cost`, and `uses`, immediately before the function body. SEP-0001 fixes only the placement and item syntax: ```spore -// Equality form (most common) -example "parses ISO date": parse_date("2024-01-15") == Ok(Date(2024, 1, 15)) - -// Boolean form -example "rejects empty string": parse_date("").is_err() - -// Multi-line form — last expression (without semicolon) is the result -example "handles leap year" { - let d = parse_date("2024-02-29"); - d == Ok(Date(2024, 2, 29)) -} -``` - -**`property` items** express universally quantified assertions using explicitly typed lambda syntax. The compiler generates random inputs for property-based testing: - -```spore -fn parse_date(s: Str) -> Result[Date, ParseError] ! ParseError +fn name(params) -> Return spec { - example "ISO 8601": parse_date("2024-01-15") == Ok(Date(2024, 1, 15)) - example "rejects ambiguous": parse_date("01/15/24") == Err(AmbiguousFormat) - - property "total": |s: Str| parse_date(s).is_ok() || parse_date(s).is_err() - property "round-trip": |d: Date| parse_date(d.to_iso_string()) == Ok(d) + example "label": expr + law "label": |x: T| expr + law "refined": |x: I32 when self >= 0| expr } { - ?parse_logic + body } ``` -Key design properties: - -- **Spec metadata survives hole bodies**: A function with `{ ?hole }` still retains its `spec` items for type checking, documentation, and HoleReport output. However, executing those spec items still calls the function body, so a hole remains a runtime error until the body is filled. -- **Current amendment scope**: This amendment defines `spec` for ordinary function declarations and `impl` methods with bodies. 
Trait-method `spec` syntax and inheritance semantics are deferred to a later amendment. -- **MissingSpec warning**: `pub` functions without a `spec` block emit a compiler warning (not error), encouraging behavioral documentation without forcing it. +`example` and `law` type checking, executable checking, diagnostics, and spec +hashing are specified in SEP-0006. Hole-report projection of `spec` metadata is +specified in SEP-0005. ### Struct and type definitions @@ -513,7 +444,7 @@ impl Display for User { impl Serialize for User { fn serialize(self) -> Str ! SerializeError { - ... + ?serialize_body } } @@ -530,10 +461,10 @@ type BinaryTree[T] = | Empty; // Refinement type -type PositiveInt = I64 if |n| n > 0; -type Percentage = F64 if |p| p >= 0.0 && p <= 100.0; -type I32 = I64 if |n| n >= -2147483648 && n <= 2147483647; -type U8 = I64 if |n| n >= 0 && n <= 255; +type PositiveInt = I64 when self > 0; +type Percentage = F64 when self >= 0.0 && self <= 100.0; +type I32 = I64 when self >= -2147483648 && self <= 2147483647; +type U8 = I64 when self >= 0 && self <= 255; ``` Field punning allows omitting the value when the variable name matches the field name, both in construction and pattern matching: @@ -583,6 +514,10 @@ let html = f"
"; ``` +Format strings (`f"..."`) and template strings (`t"..."`) both accept embedded +Spore expressions. Template-library behavior, escaping policy beyond the shared +string literal rules, and sanitization belong to SEP-0009. + ### Lambdas ```spore @@ -597,6 +532,9 @@ let complex = |x, y| { ### Concurrency +SEP-0001 fixes only the concurrency expression forms. Task lifetime, channel +behavior, scheduling, cancellation, and timeout semantics are owned by SEP-0007. + ```spore parallel_scope { let a = spawn { compute_a() }; @@ -605,55 +543,6 @@ parallel_scope { } ``` -#### Channel communication - -Channels are multi-producer, single-consumer (MPSC). Senders can be cloned; receivers cannot: - -```spore -let (tx, rx) = Channel.new[I64](buffer: 10); - -parallel_scope { - // Multiple producers via tx.clone() - let tx1 = tx.clone(); - let tx2 = tx.clone(); - - spawn { tx1.send(1) }; - spawn { tx2.send(2) }; - - spawn { - let a = rx.recv(); - let b = rx.recv(); - print(f"Received: {a}, {b}"); - }; -} -``` - -#### Select and timeout - -`select` multiplexes across channels. Use recursion for event loops (no `for`/`loop`): - -```spore -let (tx1, rx1) = Channel.new[I64](buffer: 1); -let (tx2, rx2) = Channel.new[Str](buffer: 1); - -// Recursive event loop (TCO guaranteed) -fn event_loop(rx1: Channel.Receiver[I64], rx2: Channel.Receiver[Str]) { - select { - value from rx1 => { - print(f"Got integer: {value}"); - }, - message from rx2 => { - print(f"Got string: {message}"); - }, - timeout 1000 => { - print("Timed out — stopping"); - return; - }, - } - event_loop(rx1, rx2) // tail recursion -} -``` - ### Modules and imports ```spore @@ -673,80 +562,86 @@ import std.math as math; Spore source files are UTF-8 encoded. Identifiers may contain Unicode letters, ASCII digits, and underscores. Identifiers must begin with a letter or underscore. -**Character literals.** Single-quoted `'_'` syntax is **not** part of the language. 
The reference compiler rejects it at lex time with `character literals are not supported` (see `spore` PR #113). Use a normal string literal for a single scalar (for example `"a"`, `"世"`); character-oriented helpers in `stdlib/char.sp` take `Str` values of length 1. +**Character literals.** Single-quoted character literals are not part of the grammar. Length-1 text is written as a `Str` literal, for example `"a"` or `"世"`. Character-oriented standard-library helpers are specified by SEP-0009. #### Keywords The complete reserved keyword table: -| Keyword | Purpose | -|---|---| -| `fn` | Function definition | -| `let` | Immutable binding | -| `if` / `else` | Conditional expression | -| `match` | Pattern matching expression | -| `struct` | Struct type definition | -| `type` | Type alias / sum type definition | -| `trait` | Type interface definition (trait) | -| `effect` | Effect definition / effect alias | -| `handler` | Effect handler implementation | -| `perform` | Invoke an effect operation | -| `impl` | Trait implementation block | -| `pub` / `pub(pkg)` | Visibility modifiers (public / package-visible) | -| `module` | Module definition | -| `import` | Module import | -| `alias` | Type alias | -| `where` | Generic type constraints | -| `uses` | Effect requirement declaration | -| `cost` | Four-slot cost vector clause | -| `spec` | Behavioral specification block in function declarations and `impl` methods | -| `example` | Concrete labeled example item inside a `spec` block | -| `property` | Universally quantified assertion inside a `spec` block | -| `spawn` | Spawn concurrent task | -| `select` | Channel multiplex expression | -| `parallel_scope` | Structured concurrency scope | -| `const` | Compile-time constant definition | -| `return` | Early return from function | -| `throw` | Throw an error value | -| `handle` | Bind effect handler expression | -| `with` | Handler binding (with handle) | -| `implements` | Reserved (use `impl ... 
for ...`) | -| `as` | Rename in imports | -| `in` | Reserved | -| `mut` | Reserved | -| `static` | Reserved | -| `async` / `await` | Async task operations | -| `move` | Reserved | -| `ref` | Reserved | -| `self` | Current instance in trait/handler methods (regular identifier) | -| `super` | Parent module reference | -| `crate` | Crate root reference | -| `enum` | Reserved (use `type` with variants) | -| `union` | Reserved | -| `unsafe` | Reserved | -| `extern` | Reserved | -| `macro` | Reserved | -| `mod` | Reserved (use `module`) | -| `true` / `false` | Boolean literals | -| `Some` / `None` | Option type constructors | -| `Ok` / `Err` | Result type constructors | -| `Result` / `Option` / `Ref` | Built-in parameterized types | +| Keyword | Purpose | +| --------------------------- | -------------------------------------------------------------------------- | +| `fn` | Function definition | +| `let` | Immutable binding | +| `if` / `else` | Conditional expression | +| `match` | Pattern matching expression | +| `struct` | Struct type definition | +| `type` | Type alias / sum type definition | +| `trait` | Type interface definition (trait) | +| `effect` | Effect definition / effect alias | +| `handler` | Effect handler implementation | +| `perform` | Invoke an effect operation | +| `impl` | Trait implementation block | +| `pub` / `pub(pkg)` | Visibility modifiers (public / package-visible) | +| `import` | Module import | +| `alias` | Type alias | +| `when` | Refinement predicate introducer | +| `where` | Generic type constraints | +| `uses` | Effect requirement declaration | +| `cost` | Four-slot cost vector clause | +| `spec` | Behavioral specification block in function declarations and `impl` methods | +| `example` | Concrete labeled example item inside a `spec` block | +| `law` | Universally quantified assertion inside a `spec` block | +| `spawn` | Spawn concurrent task | +| `select` | Channel multiplex expression | +| `parallel_scope` | Structured concurrency scope 
| +| `const` | Compile-time constant definition | +| `return` | Early return from function | +| `throw` | Throw an error value | +| `handle` | Bind effect handler expression | +| `with` | Handler binding (with handle) | +| `implements` | Reserved (use `impl ... for ...`) | +| `as` | Rename in imports | +| `in` | Reserved | +| `mut` | Reserved | +| `static` | Reserved | +| `async` | Reserved | +| `await` | Reserved spelling for postfix `.await` | +| `move` | Reserved | +| `ref` | Reserved | +| `self` | Current instance in trait/handler methods (regular identifier) | +| `super` | Parent module reference | +| `crate` | Crate root reference | +| `enum` | Reserved (use `type` with variants) | +| `union` | Reserved | +| `unsafe` | Reserved | +| `extern` | Reserved | +| `macro` | Reserved | +| `on` | Inline effect arm introducer inside `handle … with { … }` | +| `from` | Channel receive binder in `select` arms (`value from rx`) | +| `use` | Named handler binding inside `handle … with { … }` | +| `timeout` | Timeout arm in `select` expressions | +| `true` / `false` | Boolean literals | +| `Some` / `None` | Option type constructors | +| `Ok` / `Err` | Result type constructors | +| `Result` / `Option` / `Ref` | Built-in parameterized types | + +**Attributes.** Function-level attributes use the `@name(args)` form — a decorator-style prefix placed before `fn` (and before any doc comment). Examples: `@allows(validate, sanitize)`, `@allow(missing_spec)`. The `@` sigil is not a standalone operator; the full attribute form is parsed as part of `FunctionHeader`. 
#### Operators -| Category | Operators | Description | -|---|---|---| -| Arithmetic | `+` `-` `*` `/` `%` | Add, subtract, multiply, divide, modulo | -| Comparison | `==` `!=` `<` `>` `<=` `>=` | Equality and ordering | -| Logical | `&&` `\|\|` `!` | And, or, not | -| Bitwise | `&` `\|` `^` `~` `<<` `>>` | AND, OR, XOR, NOT, shifts | -| Pipe | `\|>` | Data-flow pipe | -| Error propagation | `?` | Propagate error from Result | -| Range | `..` `..=` | Half-open and closed range | -| Field access | `.` | Field / method access | -| Assignment | `=` | Binding assignment | -| Call | `()` | Function / method invocation | -| Index | `[]` | Index access and generic parameters | +| Category | Operators | Description | +| ----------------- | --------------------------- | --------------------------------------- | +| Arithmetic | `+` `-` `*` `/` `%` | Add, subtract, multiply, divide, modulo | +| Comparison | `==` `!=` `<` `>` `<=` `>=` | Equality and ordering | +| Logical | `&&` `\|\|` `!` | And, or, not | +| Bitwise | `&` `\|` `^` `~` `<<` `>>` | AND, OR, XOR, NOT, shifts | +| Pipe | `\|>` | Data-flow pipe | +| Error propagation | `?` | Propagate error from Result | +| Range | `..` `..=` | Half-open and closed range | +| Field access | `.` | Field / method access | +| Assignment | `=` | Binding assignment | +| Call | `()` | Function / method invocation | +| Index | `[]` | Index access and generic parameters | Spore has a **fixed operator set** — no custom operators are allowed. @@ -754,21 +649,21 @@ Spore has a **fixed operator set** — no custom operators are allowed. 
13 levels, from highest to lowest: -| Level | Operators | Associativity | Description | -|---|---|---|---| -| 1 | `.` `()` `[]` | Left | Field access, call, index | -| 2 | `-` (unary) `!` `~` | Right (prefix) | Unary negation, logical not, bitwise not | -| 3 | `*` `/` `%` | Left | Multiplicative | -| 4 | `+` `-` | Left | Additive | -| 5 | `<<` `>>` | Left | Bit shift | -| 6 | `&` | Left | Bitwise AND | -| 7 | `^` | Left | Bitwise XOR | -| 8 | `\|` | Left | Bitwise OR | -| 9 | `==` `!=` `<` `>` `<=` `>=` | Non-associative | Comparison | -| 10 | `&&` | Left | Logical AND (short-circuit) | -| 11 | `\|\|` | Left | Logical OR (short-circuit) | -| 12 | `..` `..=` | Non-associative | Range | -| 13 | `\|>` | Left | Pipe | +| Level | Operators | Associativity | Description | +| ----- | --------------------------- | --------------- | ---------------------------------------- | +| 1 | `.` `()` `[]` | Left | Field access, call, index | +| 2 | `-` (unary) `!` `~` | Right (prefix) | Unary negation, logical not, bitwise not | +| 3 | `*` `/` `%` | Left | Multiplicative | +| 4 | `+` `-` | Left | Additive | +| 5 | `<<` `>>` | Left | Bit shift | +| 6 | `&` | Left | Bitwise AND | +| 7 | `^` | Left | Bitwise XOR | +| 8 | `\|` | Left | Bitwise OR | +| 9 | `==` `!=` `<` `>` `<=` `>=` | Non-associative | Comparison | +| 10 | `&&` | Left | Logical AND (short-circuit) | +| 11 | `\|\|` | Left | Logical OR (short-circuit) | +| 12 | `..` `..=` | Non-associative | Range | +| 13 | `\|>` | Left | Pipe | The error propagation operator `?` is a postfix operator that binds tighter than any binary operator — it applies immediately to the preceding expression. 
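As a non-normative illustration of the precedence table, the following Python sketch fully parenthesizes a small expression subset. The `LEVELS` table is copied from the binary rows of the table; the tokenizer, the `parenthesize` helper, and the restriction to identifiers, integers, postfix `?`, and grouping parentheses are assumptions made for this example and are not part of the Spore toolchain.

```python
import re

# Binary operator levels copied from the precedence table (1 = tightest).
# Levels 1 and 2 (postfix and unary forms) are not modelled, except postfix `?`.
LEVELS = {
    "*": 3, "/": 3, "%": 3,
    "+": 4, "-": 4,
    "<<": 5, ">>": 5,
    "&": 6,
    "^": 7,
    "|": 8,
    "==": 9, "!=": 9, "<": 9, ">": 9, "<=": 9, ">=": 9,
    "&&": 10,
    "||": 11,
    "..": 12, "..=": 12,
    "|>": 13,
}
NON_ASSOCIATIVE = {9, 12}

# Longer operators are listed first so that `..=` is not read as `..` then `=`.
TOKEN = re.compile(
    r"\s*(\.\.=|\.\.|\|>|\|\||&&|<<|>>|==|!=|<=|>=|[-+*/%&^|<>()?]|\w+)"
)


def tokenize(src):
    tokens, pos = [], 0
    src = src.strip()
    while pos < len(src):
        match = TOKEN.match(src, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parenthesize(src):
    """Return `src` with every binary operation wrapped in parentheses."""
    tokens = tokenize(src)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def primary():
        nonlocal pos
        tok = peek()
        if tok == "(":
            pos += 1
            text = binary(13)
            if peek() != ")":
                raise SyntaxError("expected `)`")
            pos += 1
        elif tok is not None and re.fullmatch(r"\w+", tok):
            pos += 1
            text = tok
        else:
            raise SyntaxError(f"expected an operand, found {tok!r}")
        # Postfix `?` attaches to the operand before any binary operator.
        while peek() == "?":
            pos += 1
            text += "?"
        return text

    def binary(max_level):
        nonlocal pos
        left = primary()
        while True:
            level = LEVELS.get(peek())
            if level is None or level > max_level:
                return left
            op = tokens[pos]
            pos += 1
            # The right operand may only contain tighter operators,
            # which makes every level left-associative.
            right = binary(level - 1)
            left = f"({left} {op} {right})"
            if level in NON_ASSOCIATIVE and LEVELS.get(peek()) == level:
                raise SyntaxError(f"`{op}` is non-associative; add parentheses")

    result = binary(13)
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result
```

Under these assumptions, `parenthesize("x + y? * z")` groups the `?` with `y` before either binary operator is considered, and chaining two comparison operators without parentheses is rejected because level 9 is non-associative.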
@@ -785,11 +680,14 @@ The assignment operator `=` appears only in `let` bindings and is not an express 0xFF // hexadecimal 0o755 // octal 0b1010_1100 // binary -42i32 // typed suffix (reserved — not accepted by the reference lexer yet) -100u64 // typed suffix (reserved — not accepted by the reference lexer yet) +42i32 // explicitly typed signed integer +255u8 // explicitly typed unsigned integer ``` -**Implementation note:** The reference lexer parses integer literals without suffixes and the type checker gives them a default numeric type (**`I64`** in `sporec-typeck`). F64 literals default to **`F64`**. +Unsuffixed integer literals default to **`I64`** unless a checking context fixes +another integer width. Suffixes are accepted for every fixed-width integer: +`i8`, `i16`, `i32`, `i64`, `u8`, `u16`, `u32`, and `u64`. Suffix overflow is a +type-checking error, not a separate grammar form. **Floats:** @@ -799,10 +697,11 @@ The assignment operator `=` appears only in `let` bindings and is not an express 1.0e-10 2.5e+3 1_000.5 -3.14f32 // typed suffix (reserved — not accepted by the reference lexer yet) +3.14f32 ``` -**Implementation note:** Unsuffixed floats become `F64` in `sporec-typeck`. +Unsuffixed floats default to **`F64`** unless a checking context fixes another +float width. Suffixes are accepted for `f32` and `f64`. 
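The integer suffix rules can be illustrated with a short Python sketch. It recognizes the four integer bases and the eight suffixes listed in the text, then applies a range check that stands in for the type-checking error on suffix overflow. The names `parse_int_literal` and `literal_fits`, the use of `ValueError`, and the omission of a leading minus sign are assumptions of this example, not behavior of the reference compiler.

```python
import re

# Suffix -> (signed?, bit width), taken from the suffix list in the text.
INT_SUFFIXES = {
    f"{kind}{bits}": (kind == "i", bits)
    for kind in "iu"
    for bits in (8, 16, 32, 64)
}

# Body forms: hexadecimal, octal, binary, decimal; `_` separators are allowed
# after the first digit. A leading minus sign is not handled in this sketch.
INT_LITERAL = re.compile(
    r"(?P<body>0x[0-9a-fA-F][0-9a-fA-F_]*"
    r"|0o[0-7][0-7_]*"
    r"|0b[01][01_]*"
    r"|[0-9][0-9_]*)"
    r"(?P<suffix>[iu](?:8|16|32|64))?"
)
BASES = {"0x": 16, "0o": 8, "0b": 2}


def literal_fits(value, suffix):
    """Return True when `value` is representable in the suffixed width."""
    signed, bits = INT_SUFFIXES[suffix]
    if signed:
        return -(2 ** (bits - 1)) <= value <= 2 ** (bits - 1) - 1
    return 0 <= value <= 2 ** bits - 1


def parse_int_literal(text, default_suffix="i64"):
    """Return (value, type name); unsuffixed literals use `default_suffix`."""
    match = INT_LITERAL.fullmatch(text)
    if match is None:
        raise ValueError(f"not an integer literal: {text!r}")
    body = match.group("body").replace("_", "")
    base = BASES.get(body[:2], 10)
    digits = body[2:] if base != 10 else body
    value = int(digits, base)
    suffix = match.group("suffix") or default_suffix
    if not literal_fits(value, suffix):
        # Stands in for the type-checking error described in the text.
        raise ValueError(f"{text!r} does not fit in {suffix.upper()}")
    return value, suffix.upper()
```

The `default_suffix` parameter models the case where a checking context fixes a width other than `I64`; how that context is determined is outside the scope of this sketch.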
**Booleans:** `true`, `false` @@ -817,7 +716,7 @@ t"Dear {customer}, order {id}" // template string #### Comments -```spore +````spore // Line comment /// Doc comment (supports Markdown) @@ -830,23 +729,23 @@ t"Dear {customer}, order {id}" // template string /* can be nested */ continues here */ -``` +```` #### Identifiers and naming conventions -| Convention | Used for | Examples | -|---|---|---| -| `snake_case` | Variables, functions, modules | `user_name`, `calculate_total`, `http_client` | -| `PascalCase` | Types, traits, effects, enum variants | `UserAccount`, `Serialize`, `Some` | -| `SCREAMING_SNAKE_CASE` | Constants | `MAX_BUFFER_SIZE`, `PI` | +| Convention | Used for | Examples | +| ---------------------- | ------------------------------------- | --------------------------------------------- | +| `snake_case` | Variables, functions, modules | `user_name`, `calculate_total`, `http_client` | +| `PascalCase` | Types, traits, effects, enum variants | `UserAccount`, `Serialize`, `Some` | +| `SCREAMING_SNAKE_CASE` | Constants | `MAX_BUFFER_SIZE`, `PI` | ### EBNF Grammar -For effect handling, the settled grammar is `perform `, top-level `handler as (...) { ... }`, and `handle { ... } with { ... }` where each item is either `use HandlerName { ... }` or `on Effect.op(...) => ...`. +Effect handling uses `perform `, top-level `handler as (...) { ... }`, and `handle { ... } with { ... }`. Handler-block entries are either `use HandlerName { ... }` or `on Effect.op(...) => ...`. ```ebnf (* ═══════════════════════════════════════════════════ *) -(* Spore v0.1 — Complete EBNF Grammar *) +(* Spore — Complete EBNF Grammar *) (* ═══════════════════════════════════════════════════ *) (* ─── Top-level ───────────────────────────────────── *) @@ -869,10 +768,10 @@ TopLevelItem = ImportDecl ImportDecl = "import" ImportPath [ "as" Ident ] ";" ; ImportPath = Ident { "." 
Ident } ; -AliasDecl = "alias" Ident "=" QualifiedIdent ";" ; +AliasDecl = [ Visibility ] "alias" Ident "=" QualifiedIdent ";" ; (* ─── Module ──────────────────────────────────────── *) -(* REMOVED (D7): Module names are derived from file paths (see SEP-0008). +(* Module names are derived from file paths (see SEP-0008). The `module` keyword is not part of the surface syntax. *) (* ─── Struct ──────────────────────────────────────── *) @@ -894,7 +793,7 @@ TypeDecl = [ Visibility ] "type" Ident [ TypeParams ] TypeDefBody = VariantList (* sum type *) | TypeExpr (* type alias *) - | TypeExpr "if" LambdaExpr ; (* refinement type *) + | TypeExpr "when" Expr ; (* refinement type; predicate may reference `self` *) VariantList = [ "|" ] Variant { "|" Variant } ; Variant = Ident [ "(" FieldList ")" ] @@ -903,11 +802,11 @@ Variant = Ident [ "(" FieldList ")" ] (* ─── Trait ────────────────────────────────────────── *) TraitDecl = [ Visibility ] "trait" Ident [ TypeParams ] - [ ":" BoundList ] + [ ":" BoundExpr ] "{" { TraitItem } "}" ; TraitItem = AssocType - | FunctionDecl ; (* may have default body *) + | TraitFunctionSig ( ";" | Block ) ; AssocType = "type" Ident ";" ; @@ -916,7 +815,8 @@ AssocType = "type" Ident ";" ; EffectDecl = [ Visibility ] "effect" Ident [ TypeParams ] ( EffectBlock | "=" EffectAlias ";" ) ; -EffectBlock = "{" { FunctionDecl } "}" ; +EffectBlock = "{" { EffectOpDecl } "}" ; +EffectOpDecl = FunctionHeader ";" ; EffectAlias = Ident { "|" Ident } ; @@ -947,15 +847,28 @@ ConstDecl = [ Visibility ] "const" Ident ":" TypeExpr "=" Expr ";" ; (* ─── Function Declaration ────────────────────────── *) -FunctionDecl = [ DocComment ] +FunctionDecl = FunctionSig Block ; + +FunctionSig = FunctionHeader [ SpecClause ] ; + +TraitFunctionSig = FunctionHeader [ SpecClause ] ; + +(* ─── Attributes ─────────────────────────────────── *) + +Attribute = "@" Ident [ "(" [ AttrArgList ] ")" ] ; +AttrArgList = AttrArg { "," AttrArg } [ "," ] ; +AttrArg = Ident | 
StringLiteral | IntLiteral ; + +(* ─── Function Declaration ────────────────────────── *) + +FunctionHeader = { Attribute } + [ DocComment ] [ Visibility ] "fn" Ident [ TypeParams ] "(" [ ParamList ] ")" [ "->" TypeExpr ] [ ErrorClause ] [ WhereClause ] - [ UsesClause ] [ CostClause ] - [ SpecClause ] - Block ; + [ UsesClause ] ; DocComment = { "///" CommentText } ; @@ -971,7 +884,8 @@ Param = Ident ":" TypeExpr ; ErrorClause = "!" TypeExpr { "|" TypeExpr } ; WhereClause = "where" WhereConstraint { "," WhereConstraint } ; -WhereConstraint = Ident ":" Ident ; +WhereConstraint = Ident ":" BoundExpr ; +BoundExpr = Ident { "+" Ident } ; UsesClause = "uses" "[" [ EffectList ] "]" ; EffectList = Ident { "," Ident } ; @@ -982,22 +896,23 @@ CostExpr = IntLiteral | CostExpr "*" CostExpr | CostExpr "+" CostExpr | FunctionCall ; (* e.g., log2(n) *) +FunctionCall = Ident "(" [ ArgList ] ")" ; (* ─── Spec Clause ─────────────────────────────────── *) SpecClause = "spec" "{" { SpecItem } "}" ; SpecItem = ExampleItem - | PropertyItem ; + | LawItem ; ExampleItem = "example" StringLiteral ":" Expr | "example" StringLiteral Block ; -PropertyItem = "property" StringLiteral ":" PropertyLambdaExpr ; +LawItem = "law" StringLiteral ":" LawLambdaExpr ; -PropertyLambdaExpr = "|" [ PropertyParamList ] "|" ( Expr | Block ) ; -PropertyParamList = PropertyParam { "," PropertyParam } ; -PropertyParam = Ident ":" TypeExpr ; +LawLambdaExpr = "|" [ LawParamList ] "|" ( Expr | Block ) ; +LawParamList = LawParam { "," LawParam } [ "," ] ; +LawParam = Ident ":" TypeExpr [ "when" Expr ] ; (* ─── Type Expressions ────────────────────────────── *) @@ -1047,7 +962,7 @@ UnaryExpr = ( "-" | "!" | "~" ) UnaryExpr | PostfixExpr ; PostfixExpr = PrimaryExpr { PostfixOp } ; -PostfixOp = "?" | "." Ident | "." Ident "(" [ ArgList ] ")" +PostfixOp = "?" | "." "await" | "." Ident | "." Ident "(" [ ArgList ] ")" | "(" [ ArgList ] ")" | "[" Expr "]" | "[" [ Expr ] ".." 
[ Expr ] "]" | "[" [ Expr ] "..=" Expr "]" ; @@ -1080,8 +995,8 @@ ThrowExpr = "throw" Expr ; SpawnExpr = "spawn" Block ; SelectExpr = "select" "{" { SelectArm } "}" ; -SelectArm = Ident "from" Expr "=>" Block "," - | "timeout" IntLiteral "=>" Block "," ; +SelectArm = Ident "from" Expr "=>" Expr "," + | "timeout" "(" Expr ")" "=>" Expr "," ; ParallelScopeExpr = "parallel_scope" Block ; @@ -1142,12 +1057,13 @@ HoleExpr = "?" Literal = IntLiteral | FloatLiteral | BoolLiteral | StringLiteral ; -IntLiteral = DecimalInt | HexInt | OctalInt | BinaryInt ; -DecimalInt = [ "-" ] Digit { Digit | "_" } [ IntSuffix ] ; -HexInt = "0x" HexDigit { HexDigit | "_" } [ IntSuffix ] ; -OctalInt = "0o" OctDigit { OctDigit | "_" } [ IntSuffix ] ; -BinaryInt = "0b" BinDigit { BinDigit | "_" } [ IntSuffix ] ; -IntSuffix = "i32" | "i64" | "u8" | "u32" | "u64" ; +IntLiteral = ( DecimalInt | HexInt | OctalInt | BinaryInt ) [ IntSuffix ] ; +DecimalInt = [ "-" ] Digit { Digit | "_" } ; +HexInt = "0x" HexDigit { HexDigit | "_" } ; +OctalInt = "0o" OctDigit { OctDigit | "_" } ; +BinaryInt = "0b" BinDigit { BinDigit | "_" } ; +IntSuffix = "i8" | "i16" | "i32" | "i64" + | "u8" | "u16" | "u32" | "u64" ; FloatLiteral = [ "-" ] Digit { Digit | "_" } "." Digit { Digit | "_" } [ ( "e" | "E" ) [ "+" | "-" ] Digit { Digit } ] @@ -1161,71 +1077,49 @@ StringLiteral = '"' { StringChar } '"' | 'f"' { FStringPart } '"' | 't"' { TStringPart } '"' ; -FStringPart = StringChar | "{" Expr "}" ; -TStringPart = StringChar | "{" Ident "}" ; +FStringPart = FStringText | "{" Expr "}" ; +TStringPart = FStringText | "{" Expr "}" ; EscapeSeq = "\\" ( "n" | "t" | "r" | "\\" | '"' | "0" | "u{" HexDigit { HexDigit } "}" ) ; +StringChar = EscapeSeq | ? any Unicode scalar except `"` or `\` ? ; +FStringText = EscapeSeq | ? any Unicode scalar except `"`, `\`, `{`, or `}` ? ; +RawChar = ? any Unicode scalar except `"` ? ; +CommentText = ? any text until line end ? 
; (* ─── Identifier ──────────────────────────────────── *) Ident = ( Letter | "_" ) { Letter | Digit | "_" } ; -Letter = "a".."z" | "A".."Z" | UnicodeLetterExceptAscii ; +Letter = "a".."z" | "A".."Z" | UnicodeLetter ; +UnicodeLetter = ? any Unicode scalar with the Unicode Letter property ? ; Digit = "0".."9" ; HexDigit = Digit | "a".."f" | "A".."F" ; OctDigit = "0".."7" ; BinDigit = "0" | "1" ; ``` -### Function signature semantics +### Function signature layout -The function signature system is the heart of Spore's design. The clauses appear in this canonical order: +Function clauses appear in this canonical order: ```text fn []() -> [! ] [where : , ...] -[uses [, ...]] [cost [, , , ]] -[spec { }] +[uses [, ...]] +[spec { }] { } ``` -**Clause semantics:** - -1. **`-> ReturnType`** — The return type. Omitted for functions returning `Unit`. - -2. **`! ErrorTypes`** — The error set. A function with `! E1 | E2` may produce - errors of type `E1` or `E2`. Absence of `!` means the function cannot fail. - Compatibility and `?` propagation use **canonicalized** error-set comparison: - resolve each written item to its canonical nominal identity, drop duplicates, - and compare by canonical subset/equivalence. Duplicate or canonically - equivalent items are redundant and should be diagnosed, even though signature - hashing remains conservative over the written surface form. - -3. **`where T: Bound, U: Bound`** — Generic constraints. The canonical form is a single `where` clause with comma-separated constraints. Spore does not use `+` multi-bound syntax. - -4. **`uses [Effects]`** — The effect set required by this function. 
The compiler auto-infers effect properties: - - `uses []` → `pure`, `deterministic`, `total` - - `uses [FileRead]` → `deterministic` - - If `uses` contains `Random` or `Clock` → not deterministic - - `total` is inferred by the compiler's termination checker - - **Implication chain:** `pure` ⊃ `deterministic` — a pure function is necessarily deterministic (same inputs always produce same outputs). A deterministic function is not necessarily pure (it may perform IO that doesn't introduce non-determinism). `total` is orthogonal: a function can be total without being pure. - -5. **`cost [c, a, i, p]`** — A four-slot cost vector (`compute`, `alloc`, `io`, `parallel`). Each slot may reference parameters or use the currently supported linear `O(n)` forms. - -6. **`spec { ... }`** — Behavioral specification block. Contains `example` and `property` items that express the function's intended behavior as typechecked test metadata. When the enclosing function body is executable, `spore test` evaluates the `spec` block by calling the function by name. If the body still contains an unfilled hole, the `spec` block remains available to the compiler and HoleReport output, but executing it still reaches the hole and errors. - - - **`example "label": expr`** — A concrete named test case. The body expression must be `Bool`. If using `==`, both sides must have the same type. Multi-line examples use block syntax: `example "label" { ... }`. - - **`property "label": |x: T, ...| expr`** — A universally quantified assertion. The compiler generates random inputs of the declared types and verifies the body evaluates to `true` for all trials. Type annotations on property parameters are required. +SEP-0001 owns this order and the delimiter forms. Clause meaning is delegated: +return and error typing to SEP-0002, cost checking to SEP-0004, effect checking +to SEP-0003, `spec` execution and diagnostics to SEP-0006, and any +hole-specific projection of signature metadata to SEP-0005. 
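The canonical clause order (`where`, `cost`, `uses`, `spec`) can be checked mechanically. The Python sketch below assumes the clause keywords of one signature have already been extracted in source order; that extraction step and the `clause_order_errors` helper are hypothetical and not part of any Spore tool.

```python
# Canonical order of the optional clauses that follow the return and error types.
CANONICAL_ORDER = ["where", "cost", "uses", "spec"]


def clause_order_errors(clauses):
    """Return a list of problems for clause keywords given in source order."""
    problems = []
    seen = set()
    last_rank = -1
    for keyword in clauses:
        if keyword not in CANONICAL_ORDER:
            problems.append(f"unknown clause `{keyword}`")
            continue
        if keyword in seen:
            problems.append(f"duplicate clause `{keyword}`")
            continue
        seen.add(keyword)
        rank = CANONICAL_ORDER.index(keyword)
        if rank < last_rank:
            # A clause appeared after one that should follow it.
            later = CANONICAL_ORDER[last_rank]
            problems.append(f"`{keyword}` must come before `{later}`")
        last_rank = max(last_rank, rank)
    return problems
```

For example, a signature that still uses the earlier `where`, `uses`, `cost`, `spec` order would be reported with a single message about `cost`, while any subset of clauses in canonical order produces no messages.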
- For `pub` functions without a `spec` block, the compiler emits a `MissingSpec` warning (not error). Private functions do not trigger this warning. Suppress with `#[allow(missing_spec)]`. - - This amendment does not define trait-method `spec` inheritance or contract merging; that behavior is reserved for a future SEP. - -**Clause ordering** is not enforced by the grammar but is enforced by the formatter: `where` → `uses` → `cost` → `spec`. +**Clause ordering** is the canonical source and formatter order: `where` → +`cost` → `uses` → `spec`. **`self` in trait and handler methods:** The `self` identifier is a regular parameter name used as the first parameter in trait method signatures. It is not a keyword with special scoping rules. @@ -1245,341 +1139,153 @@ trait Collection { ### Expression forms -All control flow constructs are expressions that produce values. - -**Block expressions** evaluate to their last expression (without semicolon). A block ending with a semicolon-terminated statement returns `Unit`. +This section summarizes syntax only. Expression typing, evaluation order, +exhaustiveness checking, closure capture, lowering, and error propagation +semantics are delegated to SEP-0002 and SEP-0006. -**If expressions** always have a value. When used as a statement (followed by `;`), the value is discarded. +**Blocks** use `{ ... }` with semicolon-terminated statements and an optional +tail expression. -**Match expressions** are exhaustive — the compiler rejects non-exhaustive matches at compile time. +**If expressions** use `if Expr Block [else (IfExpr | Block)]`. -**Lambda expressions** use `|params| body` syntax. Closures capture variables from the enclosing scope. +**Match expressions** use `match Expr { Pattern [if Expr] => Expr, ... }`. -**Pipe expressions** desugar as follows: +**Lambda expressions** use `|params| expr` or `|params| { ... }`. 
-- `x |> f` → `f(x)` -- `x |> f(y, z)` → `f(x, y, z)` -- `x |> f(_, y)` → `f(x, y)` -- `x |> .method()` → `x.method()` +**Pipe expressions** use `left |> right`. -**Error propagation (`?`)**: `expr?` evaluates `expr`. If the result is `Ok(v)`, -yields `v`. If `Err(e)`, immediately returns `Err(e)` from the enclosing -function. This check is performed by canonical subset comparison: the callee's -canonical error set must be a subset of the enclosing function's canonical -declared error set. The first slice keeps the current call/pipeline-oriented -checker architecture; it does not require a richer expression-local `Try` model -to land first. +**Try expressions** use postfix `?`. ### Pattern matching details Supported pattern forms: -| Pattern | Syntax | Example | -|---|---|---| -| Literal | `42`, `"hello"`, `true` | `0 => "zero"` | -| Variable binding | `name` | `Some(x) => x` | -| Wildcard | `_` | `_ => "other"` | -| Constructor | `Ctor(p1, p2)` | `Some(value) => value` | -| Struct | `Name { f1, f2, .. }` | `Point { x, y } => x + y` | -| List (empty) | `[]` | `[] => "empty"` | -| List (head..tail) | `[h, ..t]` | `[head, ..tail] => head` | -| List (exact) | `[a, b]` | `[a, b] => a + b` | -| Or | `p1 \| p2` | `"Sat" \| "Sun" => true` | -| Guard | `p if cond` | `n if n > 0 => "positive"` | -| Nested | `Ok(Some(x))` | `Ok(Some(v)) => v` | +| Pattern | Syntax | Example | +| ----------------- | ----------------------- | -------------------------- | +| Literal | `42`, `"hello"`, `true` | `0 => "zero"` | +| Variable binding | `name` | `Some(x) => x` | +| Wildcard | `_` | `_ => "other"` | +| Constructor | `Ctor(p1, p2)` | `Some(value) => value` | +| Struct | `Name { f1, f2, .. 
}` | `Point { x, y } => x + y` | +| List (empty) | `[]` | `[] => "empty"` | +| List (head..tail) | `[h, ..t]` | `[head, ..tail] => head` | +| List (exact) | `[a, b]` | `[a, b] => a + b` | +| Or | `p1 \| p2` | `"Sat" \| "Sun" => true` | +| Guard | `p if cond` | `n if n > 0 => "positive"` | +| Nested | `Ok(Some(x))` | `Ok(Some(v)) => v` | ### Type system -**Primitive types:** Fixed-width numerics `I8`, `I16`, `I32`, `I64`, `U8`, `U16`, `U32`, `U64`, `F32`, and `F64`; `Bool`; UTF-8 text as `Str`. There is **no** separate `Char` type. **Implementation note:** The reference compiler’s `Ty` enum in `sporec-typeck` uses exactly those numeric widths (plus `Str`, `Bool`, `Unit`, `Never`, tuples, refinements, …); **unsuffixed integer literals default to `I64`** and float literals to **`F64`**. `Int` and `Float` are not primitive names or authoritative shorthand in the language docs. Narrower ranges are expressed with **refinement types** on a fixed-width base (for example `alias Port = I64 when self >= 1 && self <= 65535`). +This subsection is a syntax-facing overview. SEP-0002 is authoritative for primitive type meaning, inference, refinement checking, trait resolution, and type display. -**Collection types:** **`List[T]`** (default—always unbounded); **`Vec[T, max: N]`** (bounded, **rarely needed**—only when `N` belongs in the type; see SEP-0002); **`Map[K, V]`**, **`Set[T]`**, **`Array[T, N]`** +**Primitive type names:** `I8`, `I16`, `I32`, `I64`, `U8`, `U16`, `U32`, `U64`, `F32`, `F64`, `Bool`, `Str`, `Unit`, and `Never`. + +**Collection type syntax:** `List[T]`, `Vec[T, max: N]`, `Map[K, V]`, `Set[T]`, and `Array[T, N]`. **Special types:** `Option[T]`, `Result[T, E]`, `Ref[T]`, `Channel[T]`, `Unit` **Const generics:** `struct Array[T, const N: I64] { ... 
}` ```spore -// Fixed-size array struct Array[T, const N: I64] { - data: List[T], // length guaranteed to be N + data: List[T], } -// Fixed-size matrix struct Matrix[T, const ROWS: I64, const COLS: I64] { data: Array[Array[T, COLS], ROWS], } - -// Usage -let vec3: Array[F64, 3] = Array.new([1.0, 2.0, 3.0]); -let identity: Matrix[F64, 3, 3] = Matrix.identity(); ``` -**Refinement types:** `type PositiveInt = I64 if |n| n > 0;` +Const-generic checking and collection invariants are specified by SEP-0002 and +the relevant standard-library SEP. + +**Refinement types:** `type PositiveInt = I64 when self > 0;` **Function types:** `fn(I64, I64) -> I64` ### Concurrency primitives -Spore provides structured concurrency: - -- **`parallel_scope { ... }`** — All spawned tasks must complete before the scope exits. -- **`spawn { expr }`** — Launch a concurrent task within a parallel scope. -- **`select { ... }`** — Multiplex over multiple channels. -- **`task.await`** — Wait for a spawned task's result. -- **`Channel.new[T](buffer: N)`** — Create a bounded channel. - -### Hole syntax - -Holes (`?`, `?name`, `?name : Type`) enable incremental development. A function with holes type-checks but cannot be compiled for real execution — only simulated. The compiler can suggest implementations based on available bindings and the `@allows` annotation. +Concurrency forms: -#### `@allows` annotation +- `parallel_scope { ... }` +- `spawn { expr }` +- `select { ... }` +- `task.await` -The `@allows` annotation constrains which functions the compiler (or an AI agent) may use to fill a hole: +SEP-0007 owns the semantics of these forms. -```spore -@allows[validate, sanitize, format] -fn process_input(raw: Str) -> Str ! 
ValidationError { - let validated = validate(raw)?; - let sanitized = sanitize(validated); - ?final_step // this hole can only call validate/sanitize/format -} - -@allows[add, multiply, negate] -fn arithmetic(a: I64, b: I64) -> I64 { - let x = ?step1 : I64; // only add/multiply/negate - let y = ?step2 : I64; // same constraint - x + y -} -``` - -#### Hole + pipeline type inference +### Hole syntax -The compiler infers hole types through pipeline chains: +Holes are expressions: ```spore -fn example() { - let list = [1, 2, 3, 4, 5]; - let result = list - |> filter(?) // hole: fn(I64) -> Bool - |> map(?) // hole: fn(I64) -> ?R - |> fold(0, ?); // hole: fn(I64, ?R) -> I64 -} +? +?name +?name : Type ``` -#### Hole protocol reference - -`sporec query-hole --json` returns one hole object from the shared typed-hole protocol, and `sporec holes --json` returns the batch form with both `holes` and `dependency_graph`. SEP-0005 is the authoritative schema and workflow specification; SEP-0001 intentionally avoids freezing a second inline JSON shape here. - -At minimum, the shared hole object includes: - -- `name` / `display_name` -- `location` -- `expected_type` / `type_inferred_from` -- `function` / `enclosing_signature` -- `bindings` / `binding_dependencies` -- `effects`, `errors_to_handle`, `cost_budget` -- `candidates`, `dependent_holes`, `confidence`, `error_clusters` - -When a function has both a `spec` block and a hole body, that shared hole object may surface the `spec` items as additional behavioral context for tooling. The authoritative transport and evolution rules remain in SEP-0005 so SEP-0001 does not accidentally freeze a stale field layout. +SEP-0001 owns these spellings only. Partial-function status, hole typing, +pipeline inference, `@allows`, JSON output, and any `spec` metadata surfaced to +hole tooling are owned by SEP-0005. 
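The three hole spellings can be recognized with a single pattern. This Python sketch is illustrative only: `parse_hole` is a hypothetical helper, the type annotation is captured as raw text rather than parsed as a type expression, and identifier matching relies on Python's Unicode-aware `\w`, which only approximates the identifier rules of this SEP.

```python
import re

# A letter or underscore, followed by letters, digits, or underscores.
IDENT = r"[^\W\d]\w*"

# `?`, `?name`, or `?name : Type`; a type without a name is not accepted.
HOLE = re.compile(rf"\?(?:(?P<name>{IDENT})(?:\s*:\s*(?P<type>\S.*))?)?")


def parse_hole(text):
    """Return (name, type_text); either part is None when it is absent."""
    match = HOLE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a hole expression: {text!r}")
    return match.group("name"), match.group("type")
```

Anything beyond the spelling, such as how the hole is typed or reported, is owned by SEP-0005 and is deliberately not modelled here.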
## Complete examples -### Expression parser and evaluator - -This example demonstrates algebraic data types, pattern matching, recursion, error handling, and four-slot cost clauses working together: +Complete semantic examples are intentionally deferred to the SEPs that own the +corresponding behavior. ```spore -// Expression AST type Expr = Literal(I64) - | Variable(name: Str) - | BinOp(op: Op, left: Expr, right: Expr) - | UnaryOp(op: UnaryOp, expr: Expr) - | Let(name: Str, value: Expr, body: Expr) - | If(condition: Expr, then_branch: Expr, else_branch: Expr); - -type Op = Add | Sub | Mul | Div | Equal | LessThan; -type UnaryOp = Negate | Not; - -type Env = Map[Str, I64]; + | Add(left: Expr, right: Expr); -type EvalError = - UndefinedVariable(name: Str) - | DivisionByZero - | TypeError(message: Str); - -// Evaluator — pure recursion, no loops -fn eval(expr: Expr, env: Env) -> I64 ! EvalError -cost [expr_size(expr) * 10, expr_size(expr), 0, 1] -{ +fn eval(expr: Expr) -> I64 { match expr { Literal(n) => n, - - Variable(name) => match env.get(name) { - Some(value) => value, - None => throw EvalError.UndefinedVariable(name), - }, - - BinOp(op, left, right) => { - let left_val = eval(left, env)?; - let right_val = eval(right, env)?; - eval_binop(op, left_val, right_val)? - }, - - UnaryOp(op, e) => { - let val = eval(e, env)?; - match op { - Negate => -val, - Not => if val == 0 { 1 } else { 0 }, - } - }, - - Let(name, value_expr, body) => { - let value = eval(value_expr, env)?; - let new_env = env.insert(name, value); - eval(body, new_env)? - }, - - If(cond, then_branch, else_branch) => { - let cond_val = eval(cond, env)?; - if cond_val != 0 { - eval(then_branch, env)? - } else { - eval(else_branch, env)? - } - }, - } -} - -fn eval_binop(op: Op, left: I64, right: I64) -> I64 ! 
EvalError { - match op { - Add => left + right, - Sub => left - right, - Mul => left * right, - Div => if right == 0 { - throw EvalError.DivisionByZero - } else { - left / right - }, - Equal => if left == right { 1 } else { 0 }, - LessThan => if left < right { 1 } else { 0 }, - } -} - -// Usage -fn example() { - // let x = 10 in let y = 20 in x + y - let expr = Expr.Let( - "x", - Expr.Literal(10), - Expr.Let( - "y", - Expr.Literal(20), - Expr.BinOp(Op.Add, Expr.Variable("x"), Expr.Variable("y")) - ) - ); - - match eval(expr, Map.empty()) { - Ok(result) => print(f"Result: {result}"), // Result: 30 - Err(e) => print(f"Error: {e}"), + Add(left, right) => eval(left) + eval(right), } } ``` -### Concurrent producer-consumer - -This example demonstrates channels, structured concurrency, tail-recursive message processing, and zero use of loop constructs: - -```spore -type Task = - Process(id: I64, data: Str) - | Stop; - -// Producer generates tasks and sends them to a channel -fn producer(tx: Channel.Sender[Task], task_count: I64) { - (1..=task_count) - .map(|i| Task.Process(i, f"Task data {i}")) - .for_each(|task| tx.send(task)); - tx.send(Task.Stop); -} - -// Consumer processes tasks via tail recursion -fn consumer(id: I64, rx: Channel.Receiver[Task], result_tx: Channel.Sender[Str]) { - fn process(id: I64, rx: Channel.Receiver[Task], result_tx: Channel.Sender[Str]) { - match rx.recv() { - Task.Process(task_id, data) => { - let result = f"Consumer {id} processed task {task_id}: {data}"; - result_tx.send(result); - process(id, rx, result_tx) // tail recursion - }, - Task.Stop => {}, - } - } - process(id, rx, result_tx) -} - -// Result collector using tail recursion -fn collector(rx: Channel.Receiver[Str], expected: I64) { - fn collect(rx: Channel.Receiver[Str], remaining: I64) { - if remaining <= 0 { return } - let result = rx.recv(); - print(result); - collect(rx, remaining - 1) // tail recursion - } - collect(rx, expected) -} - -fn main() { - let task_count = 10; - let 
consumer_count = 3; - let (task_tx, task_rx) = Channel.new[Task](buffer: 5); - let (result_tx, result_rx) = Channel.new[Str](buffer: 10); - - parallel_scope { - spawn { producer(task_tx, task_count) }; - - (1..=consumer_count).for_each(|i| { - let rx_clone = task_rx.clone(); - let tx_clone = result_tx.clone(); - spawn { consumer(i, rx_clone, tx_clone) }; - }); - - spawn { collector(result_rx, task_count) }; - } - - print("All tasks completed!"); -} -``` +More complete examples for type checking, effects, cost, holes, concurrency, +module resolution, and standard-library APIs belong to SEP-0002 through SEP-0009. ## Human experience impact ### Readability -Spore's syntax prioritizes scannability. The fixed operator set means readers never encounter unfamiliar symbols. The pipe operator linearizes deeply nested function calls. Exhaustive `match` prevents overlooked cases. The absence of loops initially surprises imperative programmers, but the combination of `map`/`fold`/`filter` with guaranteed TCO quickly becomes natural. The `uses` clause makes side effects visible at a glance in the function signature. +Spore's syntax prioritizes scannability. The fixed operator set prevents +source-local operator definitions from changing parsing rules. The pipe operator +linearizes deeply nested function calls. The `uses` clause makes side effects +visible in the function signature. Exhaustiveness requirements for `match` are +specified by SEP-0002. ### Learning curve -- **Developers from Rust/TypeScript**: Very low barrier. Curly braces, semicolons, `let`, `match`, `fn`, and generic syntax are immediately familiar. The main novelty is `uses`/`cost`/`!` clauses and the no-loops philosophy. -- **Developers from Python/JavaScript**: Moderate curve. Static types and explicit error handling require adjustment, but f-strings, lambdas, and pipe operators feel comfortable. -- **Developers from Haskell/OCaml**: Low barrier. 
Pattern matching, ADTs, and expression-based design are familiar. Curly braces instead of significant whitespace is a stylistic shift. +- **Developers from Rust/TypeScript**: Curly braces, semicolons, `let`, `match`, `fn`, and explicit generic syntax are familiar. The main additional forms are `uses`, `cost`, and `!` clauses. +- **Developers from Python/JavaScript**: Static types and explicit error sets require adjustment. F-strings, lambdas, and pipe-style composition remain familiar entry points. +- **Developers from Haskell/OCaml**: Pattern matching, ADTs, and expression-oriented syntax are familiar. Curly braces replace significant whitespace. ### Progressive disclosure -Simple functions require zero ceremony — `fn add(a: I64, b: I64) -> I64 { a + b }` has no clauses to learn. Effects, error sets, and the `cost [...]` clause are introduced only when the code actually needs them. +Simple functions do not require optional clauses: +`fn add(a: I64, b: I64) -> I64 { a + b }`. Effects, error sets, `cost [...]`, +and `spec { ... }` are introduced only when the function signature requires +that metadata. ## Agent experience impact -Spore is designed with AI code generation agents as first-class consumers: +Code-generation agents are an explicit tooling audience for the syntax: ### Parsing predictability - **Fixed operator set**: Agents never need to resolve custom operators or precedence ambiguities. - **Regular grammar**: No significant whitespace, no optional semicolons, no implicit returns outside blocks. Every syntactic form is delimited by explicit tokens. -- **Keyword-introduced clauses**: `where`, `uses`, `cost`, `!` are unambiguous clause starters. +- **Keyword-introduced clauses**: `where`, `cost`, `uses`, `!` are unambiguous clause starters. ### Structured metadata -Function signatures are machine-readable contracts. An agent can extract: +Function signatures are machine-readable contracts. 
SEP-0001 defines stable +positions and delimiters for: - **Input/output types** from the parameter list and return type - **Failure modes** from the `! ErrorSet` @@ -1587,48 +1293,19 @@ Function signatures are machine-readable contracts. An agent can extract: - **Performance budget** from the `cost [compute, alloc, io, parallel]` clause - **Generic requirements** from `where` clauses -This enables agents to: - -1. Select appropriate functions by matching effect requirements -2. Estimate execution cost before generating call sequences -3. Verify error handling completeness -4. Compose pipelines with compatible effect sets +How agents interpret those clauses is owned by the type, effect, cost, hole, and +compiler SEPs. ### Code generation -Agents can generate Spore code incrementally using holes (`?`). A function can be specified by its signature alone, then filled in step by step. The `@allows` annotation restricts the search space for hole completion. +Agents can generate Spore code incrementally using holes (`?`). SEP-0005 owns the +hole-filling workflow and related annotations. ### Snapshot hashing -Any change to the following signature components produces a new snapshot hash and requires explicit `--permit` approval: - -| Component | Example change | -|---|---| -| Function name | `parse_config` → `load_config` | -| Parameter names | `raw` → `input` | -| Parameter order | `(a, b)` → `(b, a)` | -| Parameter types | `Str` → `Bytes` | -| Return type | `Config` → `Settings` | -| Error clause surface | add/remove/reorder/rename/duplicate any written error item | -| Cost slots | e.g. 
`[200, 20, 0, 1]` → `[300, 40, 50, 2]` | -| Effect set | add/remove any effect | -| Generic constraints | `T: Eq` → `T: Eq + Hash` | - -The `spec` block is tracked via a separate **spec hash**, independent of the snapshot hash: - -| Change | Snapshot hash | Spec hash | Required approval | -|--------|--------------|-----------|------------------| -| Add a `spec` block where none existed | unchanged | new | `--permit-spec` | -| Add/modify/remove an `example` or `property` | unchanged | changed | `--permit-spec` | -| Any signature clause change (types, effects, cost, etc.) | changed | unchanged | `--permit` | - -The lighter `--permit-spec` approval reflects that spec changes clarify behavioral contracts without altering calling conventions. Downstream callers need only re-run their own tests, not update call sites. - -For error clauses, this policy is intentionally **conservative**. Canonically -equivalent declarations such as reordered items or duplicate spellings are -treated as the same semantic error set for checking and diagnostics, but they -still change the snapshot hash until a later compatibility policy explicitly -relaxes that rule. +SEP-0001 defines the surface components that can participate in hashing. +Snapshot hashing, spec hashing, and approval flags are specified in SEP-0006 and +SEP-0008. ## Structured representation / protocol impact @@ -1657,11 +1334,7 @@ Program ├── cost? ├── spec? │ ├── examples[] - │ │ ├── label: Str - │ │ └── body: Expr - │ └── properties[] - │ ├── label: Str - │ └── predicate: LambdaExpr + │ └── laws[] └── body: Block Expr @@ -1694,10 +1367,10 @@ Expr The regular, keyword-delimited syntax supports straightforward LSP provider implementations: - **Go to definition**: Every identifier resolves unambiguously through module paths and lexical scoping. -- **Hover information**: Function signatures (including `uses`, `cost`, `spec` summary, inferred properties) can be displayed in full. 
The `spec` block is surfaced as a "behavioral contract" section showing example count and property count. -- **Completions**: After `|>`, suggest functions whose first parameter matches the pipe source type. Inside `uses [...]`, suggest available effects. Inside `spec { }`, suggest `example` and `property` as the only valid item keywords. -- **Diagnostics**: Missing match arms, undeclared effects, cost violations, incomplete error handling, missing spec on public functions, and spec failures can all be reported as structured diagnostics. -- **Signature help**: The ordered clause structure (`where` → `uses` → `cost` → `spec`) enables progressive parameter info display. +- **Hover information**: Function signatures can be displayed in full because clause order is fixed. +- **Completions**: Keyword-delimited positions such as `uses [...]` and `spec { ... }` are easy to identify. +- **Diagnostics**: Parser and formatter diagnostics can point to exact clause positions; semantic diagnostics are owned by dependent SEPs. +- **Signature help**: The ordered clause structure (`where` → `cost` → `uses` → `spec`) enables progressive parameter info display. ### Serialization @@ -1705,115 +1378,23 @@ The AST serializes naturally to JSON, S-expressions, or any tree-structured form ## Diagnostics impact -### Error messages - -Spore's explicit syntax enables highly specific error messages: - -**Missing effect:** - -```text -error[E0301]: function `fetch_data` uses effect `NetConnect` but does not declare it - --> src/api.sp:12:5 - | -12 | fn fetch_data(url: Url) -> Data ! 
NetworkError { - | ^^^^^^^^^^ missing `uses` clause - | - = help: add `uses [NetConnect]` to the function signature - = note: detected effect dependency via call to `http.get` -``` - -**Non-exhaustive match:** - -```text -error[E0401]: non-exhaustive match expression - --> src/main.sp:25:5 - | -25 | match color { - | ^^^^^ missing variant `Blue` - | - = help: add `Blue => ...` arm or use `_ => ...` wildcard -``` - -**Loop keyword used:** - -```text -error[E0101]: `for` loops are not supported in Spore - --> src/main.sp:10:5 - | -10 | for x in list { - | ^^^ Spore uses recursion and higher-order functions instead of loops - | - = help: try `list.map(|x| ...)` or `list.fold(init, |acc, x| ...)` - = help: for recursive iteration, use tail recursion (TCO is guaranteed) -``` - -**Spec clause in wrong position:** - -```text -error[E0501]: `spec` must appear after all other signature clauses - --> src/lib.sp:5:1 - | - 5 | spec { ... } - 6 | cost [500, 30, 50, 2] - | - = help: move `cost` before `spec` -``` - -**Spec example failure (runtime):** - -```text -spec failure: `parse_date` — example "ISO 8601" - --> src/dates.sp:5:5 - | - 5 | example "ISO 8601": parse_date("2024-01-15") == Ok(Date(2024, 1, 15)) - | - body evaluated to: false - = note: equality-shaped examples may additionally report compared values when available -``` - -**Spec property counterexample (runtime):** - -```text -spec counterexample: `parse_date` — property "round-trip" - --> src/dates.sp:9:5 - | - 9 | property "round-trip": |d: Date| parse_date(d.to_iso_string()) == Ok(d) - | - counterexample: d = Date { year: 2024, month: 2, day: 30 } - body evaluated to: false - = note: found after 47 trials -``` - -**Missing spec warning:** - -```text -warning[W0501]: public function `parse_date` has no `spec` block - --> src/dates.sp:3:1 - | - 3 | pub fn parse_date(s: Str) -> Result[Date, ParseError] ! ParseError { - | ^^^^^^^^^^ no behavioral contract declared - | - = help: add a `spec { example "...": ... 
}` block before the function body - = note: suppress with `#[allow(missing_spec)]` if intentional -``` - -### Recovery strategies - -The parser can recover from common errors: - -1. **Missing semicolon**: Insert one after a statement and continue parsing. -2. **Missing closing brace**: Match indentation heuristics to suggest where the brace should go. -3. **Unknown keyword**: Suggest the closest valid keyword (e.g., `func` → `fn`, `for` → `map`/`fold`). -4. **Angle brackets for generics**: Detect `List` and suggest `List[I64]`. -5. **Missing `!` clause**: When a function body calls a failable function without `?`, suggest adding the error type to the signature. +SEP-0001 enables precise syntax diagnostics because blocks, clauses, and +operators have fixed delimiters and a fixed order. Diagnostic codes, recovery +strategy, semantic errors, `spec` failures, and JSON diagnostic shape are owned +by SEP-0006. ## Drawbacks -1. **No loops**: Developers with imperative backgrounds may find the no-loop constraint frustrating initially, especially for simple counting patterns. The learning curve for converting loop-based thinking to recursion + HOFs is non-trivial. +1. **No loops**: Developers with imperative backgrounds must express iteration + through recursion or library functions, including simple counting patterns. -2. **Verbose error sets**: Functions that can fail in many ways produce long `! E1 | E2 | E3 | ...` lists. This is deliberate (errors are visible) but can clutter signatures. +2. **Verbose error sets**: Functions that can fail in many ways produce long + `! E1 | E2 | E3 | ...` lists. This makes failures visible but can lengthen + signatures. -3. **`cost` clause expressiveness**: The cost model is inherently limited — expressing accurate cost bounds for complex algorithms may be impractical. Wrong bounds are worse than no bounds. +3. **`cost` clause expressiveness**: The cost expression envelope may be + insufficient for complex algorithms. 
SEP-0004 owns the verification model and + diagnostics for inaccurate bounds. 4. **No custom operators**: Domains like linear algebra or DSLs sometimes benefit from custom operators. Spore sacrifices this for predictability. @@ -1821,33 +1402,37 @@ The parser can recover from common errors: 6. **Keyword count**: The reserved keyword set is large (40+), which constrains identifier names and increases the language surface area. -7. **`with` removal**: Auto-inferring properties from `uses` is convenient but means developers cannot see or override the inferred properties at the declaration site without inspecting compiler output. +7. **No explicit effect-property clause**: Effect attributes inferred from + `uses` are not written at the declaration site. SEP-0003 owns how those + attributes are reported by tooling. ## Alternatives considered ### Loop constructs -We considered including a minimal loop construct (e.g., Rust's `loop` or a `foreach`) but rejected it because: +We considered including a minimal loop construct, for example Rust's `loop` or a +`foreach` form, but rejected it because: -- It would undermine the functional-first philosophy. -- TCO guarantee makes recursion safe and efficient. -- HOFs (`map`, `fold`, `filter`) cover the vast majority of iteration patterns. -- A single iteration paradigm reduces cognitive load once learned. +- It would introduce a second iteration style into the core grammar. +- Recursion and library functions provide the single accepted iteration style. +- Higher-order library functions such as `map`, `fold`, and `filter` cover common iteration patterns. +- A single iteration style simplifies formatter, analyzer, and agent behavior. ### Angle brackets for generics (`List`) Rejected because: - Angle brackets create parsing ambiguity with comparison operators (`a < b > c`). -- Square brackets align with Python type hints (`List[int]`) which are increasingly familiar. -- Gleam and other modern languages have validated this choice. 
+- Square brackets align with Python type hints (`List[int]`). +- The choice matches existing precedent in modern typed languages. ### `with` clause for explicit effect properties -The original design included `with [pure, deterministic]` for explicit property annotation. This was removed because: +The original design included `with [pure, deterministic]` for explicit effect +attribute annotation. This was removed because: -- Properties are fully deterministic given the `uses` set — annotation is redundant. -- Redundant annotations can diverge from reality, creating false confidence. +- Effect attributes are derivable from the `uses` set. +- Redundant annotations can diverge from inferred effect facts. - The compiler can display inferred properties in IDE hover and `--explain` output. ### Significant whitespace (Python/Haskell style) @@ -1881,7 +1466,7 @@ Rejected because: - Invisible control flow violates the principle of explicit effects. - Error sets in signatures enable static analysis and agent reasoning. -- `?` provides comparable ergonomics to exceptions for the happy path. +- `?` keeps the common success path concise while preserving explicit error sets. ## Prior art @@ -1897,7 +1482,7 @@ Influence on: expression-based design, the trait/effect split design, no-loop ph ### Gleam -Influence on: square brackets for generics, use of `|>` pipe operator as a core language feature, the philosophy of a small fixed-operator language that is easy for tools to process. +Influence on: square brackets for generics, use of `|>` pipe operator as a core language feature, and a small fixed-operator language surface. ### Elm @@ -1913,7 +1498,7 @@ Influence on: type classes (→ traits), purity by default, the idea of tracking ### TypeScript / Python -Influence on: f-string syntax (`f"...{expr}..."`), postfix type annotations (`name: Type`), square brackets for generics (Python), the desire for a gentle learning curve. 
+Influence on: f-string syntax (`f"...{expr}..."`), postfix type annotations (`name: Type`), and square brackets for generics (Python). ### Koka @@ -1921,398 +1506,71 @@ Influence on: algebraic effect system design, the idea that effects should be in ## Backward compatibility and migration -As this is the initial v0.1 specification, there are no backward compatibility concerns. +As this is the initial root syntax specification, there are no backward compatibility concerns. ### Future compatibility considerations 1. **Keyword reservation**: The large reserved keyword set is intentional — it preserves room for future features (e.g., `async`, `unsafe`, `macro`) without breaking existing code. -2. **Signature evolution**: The snapshot hash system means any signature change requires explicit approval. Future SEPs that modify signature clause syntax must define migration tooling. +2. **Signature evolution**: Future SEPs that modify signature clause syntax must + define migration tooling and specify their interaction with signature or + snapshot hashing. -3. **Cost model changes**: The `cost` clause semantics may evolve. Future versions should define a cost model versioning scheme so that cost bounds written today remain meaningful. +3. **Cost model changes**: The `cost` clause semantics may evolve. Future changes should define a migration story so that cost bounds written today remain meaningful. 4. **New expression forms**: The grammar is designed to be extensible — new expression forms can be added as new `PrimaryExpr` alternatives without disturbing existing productions. -5. **Standard library stability**: Type names like `Option`, `Result`, `List`, `Map` are part of the language surface. Renaming these would require a major version bump and automated migration. +5. **Standard library stability**: Type names like `Option`, `Result`, `List`, `Map` are part of the language surface. Renaming these would require an explicit breaking-change migration. 
## Unresolved questions -1. **Exact cost model semantics**: What units does `cost` measure? CPU cycles? Abstract "work units"? How does the compiler verify cost bounds, especially for recursive functions? - -2. **Effect composition rules**: When a function calls two sub-functions with different `uses` sets, is the resulting set the union? How are effect aliases expanded for comparison? - -3. **Refinement type checking**: How deeply does the compiler verify refinement predicates (`type Positive = I64 if |n| n > 0`)? Is this compile-time only, or does it generate runtime checks? - -4. **`throw` semantics**: Is `throw` syntactic sugar for `return Err(...)`, or does it have distinct unwinding semantics? The current spec treats it as sugar but this needs formalization. - -5. **Module system details**: How do circular module dependencies behave? What is the exact resolution order for qualified names? - -6. **`self` method dispatch**: How does the compiler resolve method calls on `self` — is it static dispatch (monomorphized) or dynamic dispatch (vtable)? This affects performance characteristics. - -7. **Format string security**: What sanitization, if any, applies to `f"..."` expressions to prevent injection attacks? How do `t"..."` template objects interact with the effect system? - -8. **Tail-call optimization scope**: Is TCO guaranteed only for direct self-recursion, or for mutual recursion and continuation-passing style as well? - -9. **Partial application**: The pipe operator supports `x |> f(_, y)` placeholder syntax. Should Spore support general partial application beyond pipe contexts? - -10. **Pattern matching on strings**: The current spec shows string literal patterns. Should Spore support regex patterns or prefix/suffix matching in `match` arms? - -11. **Standard effect taxonomy evolution**: SEP-0003 now defines the initial built-in effect vocabulary (`Console`, `FileRead`, `FileWrite`, `NetConnect`, `NetListen`, `Env`, `Spawn`, `Clock`, `Random`, `Exit`). 
Future SEPs may extend it, but changes should preserve the intent-oriented classification. - -12. **Error-set surface aliases**: This slice standardizes canonicalized error-set -equivalence and subset checks without introducing a dedicated `error Alias = ...` -surface. Should a later SEP add user-declared error-set aliases, or should Spore -continue relying on existing nominal type names plus canonicalization? - -13. **`property` with `uses`**: If a property calls a function that has side effects, what effect scope applies? Current proposal: property items inherit the enclosing function's `uses` set. - -14. **Future trait-level behavioral contracts**: Should a later SEP allow `spec` blocks on trait methods, and if so how should implementations inherit or refine them? This amendment leaves trait-level `spec` syntax and semantics unspecified. - -15. **`example` execution order**: Should `example` items run in declaration order (default) or allow parallel execution via `--parallel-spec`? - -16. **Negative examples**: Should `spec` support expressing that a call should fail at compile time (type error)? This would enable testing the type system itself but requires a different mechanism. - -## Appendix A: Small-step operational semantics - -This appendix formalizes the evaluation rules for core Spore expressions using small-step operational semantics. All Spore code evaluates under a **call-by-value** strategy. 
- -### Notation - -- `e ↝ e'` — expression `e` reduces to `e'` in one step -- `v` — a value (fully evaluated: literal, closure, struct/enum instance, list) -- `E[·]` — evaluation context (position where the next reduction occurs) -- `σ` — runtime environment (variable → value mapping) -- `H` — effect handler table (effect → handler mapping) -- `e[x ↦ v]` — substitute `v` for `x` in `e` - -### Values - -```text -v ::= n (integer literal) - | f (float literal) - | true | false (boolean) - | "s" (string literal) - | 'c' (char literal) - | () (unit) - | S { f₁: v₁, ..., fₙ: vₙ } (struct instance) - | V(v₁, ..., vₙ) (enum variant instance) - | V (zero-field enum variant) - | [v₁, ..., vₙ] (list) - | (closure with captured env) -``` - -### Evaluation contexts - -Evaluation contexts define the order of evaluation (left-to-right, call-by-value): - -```text -E ::= □ (hole) - | E op e₂ (left of binop) - | v₁ op E (right of binop) - | f(v₁, ..., vᵢ₋₁, E, eᵢ₊₁, ..., eₙ) (function argument) - | let x = E; e₂ (let initializer) - | if E { e₁ } else { e₂ } (if condition) - | match E { arms } (match scrutinee) - | S { f₁: v₁, ..., fᵢ: E, ..., fₙ: eₙ } (struct field) - | V(v₁, ..., vᵢ₋₁, E, eᵢ₊₁, ..., eₙ) (enum constructor arg) - | E.field (field access receiver) - | E |> f (pipe left-hand side) - | [v₁, ..., vᵢ₋₁, E, eᵢ₊₁, ..., eₙ] (list element) - | E? (try operator operand) - | spawn E (spawn body — eager eval) -``` - -### Core reduction rules - -**Literals and variables:** - -```text -[E-Var] - σ(x) = v - ────────────── - σ ⊢ x ↝ v - -(Variables resolve to their bound value in the environment.) 
-``` - -**Arithmetic and comparison:** - -```text -[E-BinOp] - v₁ op v₂ = v₃ (where op is +, -, *, /, %, ==, !=, <, >, <=, >=, &&, ||) - ────────────────── - v₁ op v₂ ↝ v₃ - -[E-StrConcat] - "s₁" + "s₂" ↝ "s₁s₂" - -[E-UnaryNeg] - -n ↝ (-n) - -[E-UnaryNot] - !b ↝ ¬b -``` - -**Let binding:** - -```text -[E-Let] - σ ⊢ let x = v; e₂ ↝ σ[x ↦ v] ⊢ e₂ - -(Bind the value and continue evaluating the body with extended environment.) -``` - -**Function application (β-reduction):** - -```text -[E-App] - σ ⊢ f(v₁, ..., vₙ) - where f is defined as fn f(x₁, ..., xₙ) { body } - ───────────────────────────────────────────────── - σ[x₁ ↦ v₁, ..., xₙ ↦ vₙ] ⊢ body - -[E-ClosureApp] - (v₁, ..., vₙ) - ───────────────────────────────────────────────────── - σ_captured[x₁ ↦ v₁, ..., xₙ ↦ vₙ] ⊢ body - -(Closures evaluate in their captured environment, extended with arguments.) -``` - -**Lambda creation:** - -```text -[E-Lambda] - σ ⊢ |x₁, ..., xₙ| body ↝ - -(Captures the current environment.) -``` - -**Conditionals:** - -```text -[E-IfTrue] - if true { e₁ } else { e₂ } ↝ e₁ - -[E-IfFalse] - if false { e₁ } else { e₂ } ↝ e₂ -``` - -**Pattern matching:** - -```text -[E-Match] - match v { p₁ => e₁, ..., pₙ => eₙ } - where pᵢ is the first pattern matching v with bindings B - ────────────────────────────────────────────────────── - ↝ eᵢ[B] - -(Try patterns top-to-bottom. First match wins. Apply bindings to body.) -``` - -Pattern matching sub-rules: - -```text -[Pat-Wildcard] - _ matches any v, bindings = ∅ - -[Pat-Var] - x matches any v, bindings = {x ↦ v} - -[Pat-Literal] - n matches n, bindings = ∅ (and similarly for string, bool literals) - -[Pat-Constructor] - V(p₁, ..., pₖ) matches V(v₁, ..., vₖ) - if each pᵢ matches vᵢ with bindings Bᵢ - bindings = B₁ ∪ ... ∪ Bₖ - -[Pat-Struct] - S { f₁: p₁, ..., fₖ: pₖ } matches S { ..., fᵢ: vᵢ, ... } - if each pᵢ matches vᵢ with bindings Bᵢ - bindings = B₁ ∪ ... 
∪ Bₖ - -[Pat-Or] - (p₁ | p₂) matches v if p₁ matches v or p₂ matches v - -[Pat-List-Empty] - [] matches [] - -[Pat-List-Cons] - [p₁, ..rest] matches [v₁, v₂, ..., vₙ] - if p₁ matches v₁ with bindings B₁ - rest binds to [v₂, ..., vₙ] - bindings = B₁ ∪ {rest ↦ [v₂, ..., vₙ]} - -[Pat-Guard] - p if cond => e - p matches v with bindings B, and B ⊢ cond ↝ true -``` - -**Struct and enum construction:** - -```text -[E-StructLit] - S { f₁: v₁, ..., fₙ: vₙ } ↝ S { f₁: v₁, ..., fₙ: vₙ } - -(Struct literals with all fields evaluated are values.) - -[E-EnumConstruct] - V(v₁, ..., vₙ) ↝ V(v₁, ..., vₙ) - -(Fully evaluated variant constructors are values.) -``` - -**Field access:** - -```text -[E-FieldAccess] - S { ..., f: v, ... }.f ↝ v -``` - -**Pipe operator:** - -```text -[E-Pipe] - v |> f ↝ f(v) - -(Pipe desugars to function application.) -``` - -**Block expressions:** - -```text -[E-Block] - { stmt₁; stmt₂; ...; stmtₙ; tail } - ↝ evaluate stmt₁, then { stmt₂; ...; stmtₙ; tail } - -[E-BlockTail] - { v } ↝ v -``` - -**Try (? operator):** - -```text -[E-TryOk] - Ok(v)? ↝ v - -[E-TryErr] - Err(e)? ↝ throw Err(e) - -(Unwraps Ok, propagates Err to the nearest error boundary.) -``` - -**Throw and error propagation:** - -```text -[E-Throw] - throw v ↝ RuntimeError(v) - -(Propagates up the call stack until caught by a match or error boundary.) -``` - -### Effect dispatch - -```text -[E-ForeignCall] - H(effect, fn_name) = handler - handler(v₁, ..., vₙ) ↝ᵢₒ v - ────────────────────────────────── - foreign_fn(v₁, ..., vₙ) ↝ v - -(Foreign function calls are dispatched to the platform's effect handler table. -The handler may perform real I/O — this is the only impure reduction rule.) -``` - -### Concurrency (structured) - -```text -[E-Spawn] - spawn e ↝ Task(e, fresh_id) - -(Creates a new task. In the PoC interpreter, evaluates synchronously. -In a production runtime, this would fork a lightweight thread.) 
- -[E-Await] - await Task(v, id) ↝ v - -(Blocks until the task completes and extracts its value. -In PoC, this is a no-op since spawn already evaluated.) - -[E-Select] - select { task₁ => e₁, ..., taskₙ => eₙ } - where taskᵢ is the first to complete with value v - ──────────────────────────────────────────────── - ↝ eᵢ[result ↦ v] -``` - -### Hole evaluation - -```text -[E-Hole] - ?name ↝ RuntimeError("hit unfilled hole `?name`") - -(Holes are compile-time constructs. Reaching one at runtime is always an error.) -``` - -### Spec evaluation - -```text -[E-SpecExample] - spec_example("label", body_expr) - body_expr ↝* true - ────────────────────────────────── - SpecResult("label", Pass) - -[E-SpecExampleFail] - spec_example("label", body_expr) - body_expr ↝* false - ────────────────────────────────── - SpecResult("label", Fail(left_value, right_value)) - -[E-SpecProperty] - spec_property("label", |x₁: T₁, ..., xₙ: Tₙ| body) - ∀ generated (v₁, ..., vₙ): body[x₁ ↦ v₁, ..., xₙ ↦ vₙ] ↝* true - ────────────────────────────────── - SpecResult("label", Pass) - -[E-SpecPropertyCounterexample] - spec_property("label", |x₁: T₁, ..., xₙ: Tₙ| body) - ∃ (v₁, ..., vₙ): body[x₁ ↦ v₁, ..., xₙ ↦ vₙ] ↝* false - ────────────────────────────────── - SpecResult("label", Counterexample(v₁, ..., vₙ)) - -(Spec items are evaluated by `spore test`, not during normal program execution. -Each example/property is compiled as a standalone test case that calls the function -by name. If that function body still contains a hole, the ordinary hole runtime-error -rule applies, so the `spec` remains useful metadata but not a normal runnable path yet.) 
-``` - -### List operations (builtins) - -```text -[E-ListLen] - len([v₁, ..., vₙ]) ↝ n - -[E-ListMap] - map([v₁, ..., vₙ], f) ↝ [f(v₁), ..., f(vₙ)] - -[E-ListFilter] - filter([v₁, ..., vₙ], p) ↝ [vᵢ | p(vᵢ) = true] - -[E-ListFold] - fold([v₁, ..., vₙ], init, f) ↝ f(...f(f(init, v₁), v₂)..., vₙ) - -[E-ListEach] - each([v₁, ..., vₙ], f) ↝ f(v₁); ...; f(vₙ); () -``` - -### Properties - -1. **Determinism**: All pure reduction rules are deterministic. Non-determinism arises only from `select` (which task completes first) and `Random` effects. - -2. **Type preservation (subject reduction)**: If `Γ; S ⊢ e : T` and `e ↝ e'`, then `Γ; S ⊢ e' : T`. (Proof sketch: by induction on the typing derivation — see SEP-0002 typing judgments.) - -3. **Progress**: If `Γ; S ⊢ e : T` and `e` is not a value, then either `e ↝ e'` for some `e'`, or `e` is a `foreign fn` call awaiting I/O. (Holes are ruled out at type-check time in complete programs.) - -4. **Effect soundness**: If a function has `uses []` (no effects), its evaluation never reaches an `[E-ForeignCall]` rule. (By the effect subset check in SEP-0003.) +SEP-0001 has no unresolved questions that block accepting the root syntax +surface. The following decisions are accepted here because they affect grammar +or signature layout: + +1. **`throw`** is a core expression form. SEP-0001 owns only its spelling and + placement; SEP-0002 owns its error-channel typing and lowering. + +2. **Placeholder partial application** is a general call-position form: + `f(_, y)` creates a lambda over the placeholder argument. Pipe use such as + `x |> f(_, y)` is a direct application of that same rule. + +3. **Interpolation boundaries** are fixed for this accepted surface: both `f"..."` and `t"..."` + embed Spore expressions. SEP-0009 owns template rendering and sanitization. + +4. **String patterns** are literal patterns only. Regex, prefix, suffix, or + other string-pattern forms require a future syntax SEP. + +5. 
**Behavioral `spec` blocks** are accepted on ordinary function declarations, + trait method signatures, and `impl` methods. Trait-method contract + inheritance or merging is not part of SEP-0001. + +6. **Negative examples** are not part of the accepted `spec` grammar. The only + accepted `spec` item keywords are `example` and `law`. + +### Delegated to dependent SEPs + +- **Type and refinement semantics**: SEP-0002 owns primitive type meaning, + type inference, refinement checking, method dispatch, and error-set + canonicalization. +- **Effect composition and taxonomy**: SEP-0003 owns effect-set union, + alias expansion, handler discharge, and future effect vocabulary changes. +- **Cost units and verification**: SEP-0004 owns cost dimensions, units, + recursion analysis, and verification algorithms. +- **Hole reporting and partial-function status**: SEP-0005 owns HoleReport + fields, dependency graphs, and agent workflow. +- **Compiler/runtime guarantees**: SEP-0006 owns formatter enforcement, + diagnostics, lowering, and runtime implementation details. +- **Concurrency behavior**: SEP-0007 owns `spawn`, `await`, `select`, task + lifetime, cancellation, and channel semantics. +- **Module resolution**: SEP-0008 owns circular dependencies, import resolution, + package manifests, and content-addressed hashing. +- **Standard-library names and behavior**: SEP-0009 owns prelude items, + collection APIs, string helpers, and platform-facing library modules. + +## Appendix A: Delegated semantics + +SEP-0001 intentionally does not define operational semantics. Evaluation order, +type preservation, effect soundness, hole runtime behavior, `spec` execution, +standard-library operations, and concurrency runtime rules are delegated to +SEP-0002 through SEP-0009. 
diff --git a/seps/SEP-0002-type-system.md b/seps/SEP-0002-type-system.md index 3b63052..0fe361a 100644 --- a/seps/SEP-0002-type-system.md +++ b/seps/SEP-0002-type-system.md @@ -21,9 +21,7 @@ superseded_by: null This SEP specifies the type system for the Spore programming language. Spore's type system is a **nominal-primary, bidirectionally-inferred** system that sits between Rust and Haskell on the expressiveness spectrum — more targeted than full dependent types, yet richer than a Hindley–Milner core alone. -The concrete type representation is the `Ty` enum: - -Implementation note: **`EffectSet`** stores effect names as strings in current compiler internals (`BTreeSet` in Rust). Surface syntax references those names as literal identifiers. +The concrete type representation is defined by the following type grammar: ```text Ty ::= I8 | I16 | I32 | I64 | U8 | U16 | U32 | U64 | F32 | F64 @@ -39,7 +37,7 @@ Ty ::= I8 | I16 | I32 | I64 | U8 | U16 | U32 | U64 | F32 | F64 | Error ``` -**Reference compiler:** This matches the `Ty` enum shipped in `sporec-typeck` (see `crates/sporec-typeck/src/types.rs`): fixed-width numerics, no `Char`, plus `Tuple` and `Refined`. Single-quoted character literals are rejected in the lexer (**no `Char` type or literals**, `spore` PR #113). **Unsuffixed integer literals infer as `I64` and float literals as `F64`** unless a context forces another fixed width. `Int` and `Float` are not primitive names or informal aliases. +Where **`EffectSet`** is a set of effect-name identifiers (the surface syntax references those names as literal identifiers), and **`ErrorSet`** is a closed set of error types. Unsuffixed integer literals default to **`I64`** and float literals to **`F64`** unless a context forces another fixed width. `Int` and `Float` are not primitive names or informal aliases. There is no `Char` type or character literal; single Unicode scalars are represented as `Str` values. **Platform-specific integers (e.g. 
process exit status).** The core type system does **not** fix a single machine type for exit codes or other host ABI surfaces. Those contracts live in the **Platform package** metadata and startup/exit specifications (see SEP-0008); this SEP only requires that the Spore signature matches whatever that Platform declares. @@ -48,16 +46,12 @@ Ty ::= I8 | I16 | I32 | I64 | U8 | U16 | U32 | U64 | F32 | F64 Key design decisions: - **Signatures are gravity centers** — function signatures must be richly typed and fully explicit; bodies are inferred. -- **EffectSet** is encoded directly in function types (see implementation note above), making effect tracking a first-class part of the type. -- **Current implementation note**: the checker currently carries error sets with a - similarly plain set-backed internal representation. This wave specifies the - target canonicalization-first semantics layered on top of that representation, - without requiring a new dedicated `error` declaration surface. +- **EffectSet** is encoded directly in function types, making effect tracking a first-class part of the type. - **Bidirectional type inference** — top-down checking from signatures, bottom-up synthesis from expressions. - **Two-pass module checking** — `register_item` (collect all signatures) then `check_fn` (verify bodies). - **Generics via square-bracket application** — `List[I64]`, `Result[T, E]`. - **Type unification** with `Var` binding and substitution maps. -- **Refinement types** — L0 decidable predicates ship in `sporec-typeck` (`Ty::Refined`); richer L1 abstract interpretation without an SMT solver remains future work. +- **Refinement types** — L0 decidable predicates are specified in §4.11; L1 abstract interpretation (flow-sensitive narrowing) is specified as a future extension without requiring an SMT solver. ## Motivation @@ -76,19 +70,19 @@ Without a specification: - Contributors cannot reason about whether a type-checking change is correct. 
- Agent-generated code has no formal target to satisfy. -- Diagnostics cannot explain *why* a type error exists — only *that* one exists. +- Diagnostics cannot explain _why_ a type error exists — only _that_ one exists. - Future features (refinements, traits, const generics) have no foundation to build on. ### Design philosophy -| Principle | Implication | -|---|---| -| **Signatures are gravity centers** | Signatures must be richly typed and explicit; bodies are inferred | -| **Nominal-primary, structural escape** | Named types are nominal; anonymous records are structural; traits and effects are always nominal | -| **Traits ≠ Effects** | Separate abstractions — `trait` defines type interfaces, `effect` defines external operations; they are not unified | -| **Decidable checking** | No SMT solver; refinements limited to decidable predicates and abstract interpretation | -| **Agent-friendly** | Richer types help Agents more than they hurt humans; Agents absorb annotation complexity | -| **Predictable** | No implicit conversions, no type-level computation, no SFINAE-style surprises | +| Principle | Implication | +| -------------------------------------- | ------------------------------------------------------------------------------------------------------------------- | +| **Signatures are gravity centers** | Signatures must be richly typed and explicit; bodies are inferred | +| **Nominal-primary, structural escape** | Named types are nominal; anonymous records are structural; traits and effects are always nominal | +| **Traits ≠ Effects** | Separate abstractions — `trait` defines type interfaces, `effect` defines external operations; they are not unified | +| **Decidable checking** | No SMT solver; refinements limited to decidable predicates and abstract interpretation | +| **Agent-friendly** | Richer types help Agents more than they hurt humans; Agents absorb annotation complexity | +| **Predictable** | No implicit conversions, no type-level computation, no 
SFINAE-style surprises | ## Guide-level explanation @@ -98,15 +92,15 @@ This section introduces Spore's type system from a user's perspective, with code Spore provides a small, fixed set of built-in **numeric widths**, text, and logical primitives: -| Type | Description | Notes | -|---|---|---| -| `I8`, `I16`, `I32`, `I64` | Signed integers | Unsuffixed literals default to **`I64`** in `sporec-typeck` | -| `U8`, `U16`, `U32`, `U64` | Unsigned integers | | -| `F32`, `F64` | IEEE-754 binary floats | Literals default to `F64` in the reference checker | -| `Bool` | Boolean | `true`, `false` | -| `Str` | UTF-8 string | Use `"x"` even for a single Unicode scalar; there is **no** `Char` type | -| `Unit` | Zero-information type (like Rust `()`) | `()` | -| `Never` | Bottom type — uninhabited, no values exist | (no literal) | +| Type | Description | Notes | +| ------------------------- | ------------------------------------------ | ----------------------------------------------------------------------- | +| `I8`, `I16`, `I32`, `I64` | Signed integers | Unsuffixed literals default to **`I64`** | +| `U8`, `U16`, `U32`, `U64` | Unsigned integers | | +| `F32`, `F64` | IEEE-754 binary floats | Unsuffixed literals default to **`F64`** | +| `Bool` | Boolean | `true`, `false` | +| `Str` | UTF-8 string | Use `"x"` even for a single Unicode scalar; there is **no** `Char` type | +| `Unit` | Zero-information type | `()` | +| `Never` | Bottom type — uninhabited, no values exist | (no literal) | ```spore let x: I64 = 42 // default integer literal type (I64) @@ -119,7 +113,7 @@ let u: () = () // unit (zero-information type) Distinct numeric widths are **not** interchangeable without an explicit conversion story (still evolving). `Str` is unrelated to any numeric width. 
-**Narrower domains via refinement.** Bounds and lightweight predicates attach to a fixed-width base type (the reference compiler supports this for aliases such as `alias Port = I64 when …`): +**Narrower domains via refinement.** Bounds and lightweight predicates attach to a fixed-width base type: ```spore alias NonNegativeI64 = I64 when self >= 0 @@ -248,6 +242,12 @@ uses [FileWrite] } ``` +### 3.5.1 Nested statement `fn` vs lambdas + +Statement-form nested `fn` declarations (SEP-0001 grammar: `Statement` includes `FunctionDecl` labeled _local function_) introduce ordinary nested functions scoped to the enclosing block. Their bodies **do not** close over enclosing `let` bindings or the enclosing function's parameters (type parameters inherited from enclosing generic headers remain in scope). All value state threaded into helpers must appear as explicit parameters. + +Lambda expressions (`|params|`) are block-level expressions that may capture bindings from the enclosing value environment (see the `[Lambda]` judgment below). + ### 3.6 Generics Generic functions declare type parameters in square brackets and constrain them in `where` clauses: @@ -451,7 +451,7 @@ trait Serialize { fn serialize(self) -> Bytes ! SerializeError } -effect HttpClient = NetConnect | Clock +effect HttpClient = NetConnect | Clock; fn fetch_page(url: Url) -> Response ! 
NetworkError uses [HttpClient] @@ -537,18 +537,18 @@ GATs cover the most common use cases that HKTs would serve without introducing f Spore provides compiler-known traits for common operations: -| Trait | Purpose | Derivable | -|---|---|---| -| `Eq` | Equality comparison | Yes | -| `Ord` | Ordering | Yes | -| `Clone` | Value duplication | Yes | -| `Display` | Human-readable formatting | No | -| `Debug` | Debug formatting | Yes | -| `Hash` | Hash computation | Yes | -| `Default` | Default value construction | Yes | -| `Serialize` | Serialization to bytes | Yes | -| `Deserialize` | Deserialization from bytes | Yes | -| `Add`, `Sub`, `Mul`, `Div` | Arithmetic operators | No | +| Trait | Purpose | Derivable | +| -------------------------- | -------------------------- | --------- | +| `Eq` | Equality comparison | Yes | +| `Ord` | Ordering | Yes | +| `Clone` | Value duplication | Yes | +| `Display` | Human-readable formatting | No | +| `Debug` | Debug formatting | Yes | +| `Hash` | Hash computation | Yes | +| `Default` | Default value construction | Yes | +| `Serialize` | Serialization to bytes | Yes | +| `Deserialize` | Deserialization from bytes | Yes | +| `Add`, `Sub`, `Mul`, `Div` | Arithmetic operators | No | Derivable traits can be auto-implemented by the compiler for types whose fields all implement the trait: @@ -910,11 +910,7 @@ checked with **canonicalization-first semantics**. Error types are nominal ADTs; the checker resolves each written item to a canonical identity, removes duplicates, and compares sets using that canonical form across call chains. -Current implementation note: the shipping checker already supports the -`! E1 | E2` surface and stores propagated error requirements in a plain set-like -internal form. This section defines the **target behavior of this wave**: -canonical subset/equivalence checks, redundancy diagnostics, and conservative -hashing. It does **not** require a new top-level `error Alias = ...` syntax. 
+This section defines the canonicalized error-set semantics: subset/equivalence checks, redundancy diagnostics, and conservative hashing. ```text canonicalize(E_written): @@ -1012,32 +1008,32 @@ struct Tree[T] { ### 3.16 Nominal vs structural rules -| Context | Typing Discipline | Rationale | -|---|---|---| -| Named types (`struct` / `type` with variants) | **Nominal** | `UserId ≠ Str` even if same representation | -| Enums | **Nominal** | Sealed, exhaustiveness-checked | -| Traits / effects | **Nominal** | Explicit `impl` required | -| Effects | **Nominal** (always) | Security boundary — no structural coincidence | -| Anonymous records `{ ... }` | **Structural** | Flexibility for intermediate values | -| Hole-filling search (internal) | **Structural** | Agent searches by shape, compiler enforces nominal at boundaries | -| Function call boundaries | **Nominal** | Callee specifies named types, caller must provide them | +| Context | Typing Discipline | Rationale | +| --------------------------------------------- | -------------------- | ---------------------------------------------------------------- | +| Named types (`struct` / `type` with variants) | **Nominal** | `UserId ≠ Str` even if same representation | +| Enums | **Nominal** | Sealed, exhaustiveness-checked | +| Traits / effects | **Nominal** | Explicit `impl` required | +| Effects | **Nominal** (always) | Security boundary — no structural coincidence | +| Anonymous records `{ ... }` | **Structural** | Flexibility for intermediate values | +| Hole-filling search (internal) | **Structural** | Agent searches by shape, compiler enforces nominal at boundaries | +| Function call boundaries | **Nominal** | Callee specifies named types, caller must provide them | ### 3.17 Type inference annotation requirements -| Element | Must Annotate? 
| Why | -|---|---|---| -| Function parameter types | **Yes** | Gravity center — the signature IS the API | -| Function return type | **Yes** | Agent reads signatures for synthesis; human reads for understanding | -| Error sets (`! ...`) | **Yes** | Error contract — must be visible | -| Effect sets (`uses [...]`) | **Yes** | Security boundary — must be visible | -| Cost clause (`cost [c, a, i, p]`) | **Yes** | Performance contract — must be visible | -| Struct/type field types | **Yes** | Data definition — must be explicit | -| Trait method signatures | **Yes** | Interface contract | -| Public constants | **Yes** | API surface | -| Local variable types | No | Inferred from RHS: `let x = compute(...)` | -| Generic type params at call sites | No | Inferred from arguments: `sort(my_list)` — T inferred from `my_list` | -| Closure parameter types (in context) | No | Inferred: `xs.map(\|x\| x + 1)` — x inferred from the receiver item type | -| Intermediate expression types | No | Standard local inference | +| Element | Must Annotate? | Why | +| ------------------------------------ | -------------- | ------------------------------------------------------------------------ | +| Function parameter types | **Yes** | Gravity center — the signature IS the API | +| Function return type | **Yes** | Agent reads signatures for synthesis; human reads for understanding | +| Error sets (`! 
...`) | **Yes** | Error contract — must be visible | +| Effect sets (`uses [...]`) | **Yes** | Security boundary — must be visible | +| Cost clause (`cost [c, a, i, p]`) | **Yes** | Performance contract — must be visible | +| Struct/type field types | **Yes** | Data definition — must be explicit | +| Trait method signatures | **Yes** | Interface contract | +| Public constants | **Yes** | API surface | +| Local variable types | No | Inferred from RHS: `let x = compute(...)` | +| Generic type params at call sites | No | Inferred from arguments: `sort(my_list)` — T inferred from `my_list` | +| Closure parameter types (in context) | No | Inferred: `xs.map(\|x\| x + 1)` — x inferred from the receiver item type | +| Intermediate expression types | No | Standard local inference | ### 3.18 Typed holes @@ -1064,9 +1060,25 @@ Hole ?h2: expected type Bool ## Reference-level explanation +### Notation + +| Symbol | Meaning | Used in | +| ------ | ------------------------------------------------- | -------------------------- | +| `τ` | Type | Type grammar, judgments | +| `σ` | Parameter type (a `τ` used in parameter position) | Function signatures | +| `Γ` | Type environment (variable → type mapping) | Typing judgments | +| `⊢` | Judgment turnstile ("entails") | Typing rules | +| `⇒` | Type synthesis (infer type bottom-up) | Bidirectional typing | +| `⇐` | Type checking (check against expected type) | Bidirectional typing | +| `<:` | Subtype relation | Subtyping rules | +| `C` | EffectSet (set of effect names) | Function types | +| `E` | ErrorSet (set of error types) | Function types, `!` clause | +| `ι` | Integer width metavariable ∈ {I8, …, U64} | Operator typing | +| `φ` | Float width metavariable ∈ {F32, F64} | Operator typing | + ### 4.1 Type grammar -The internal type representation is the `Ty` enum, defined in `crates/sporec-typeck/src/types.rs`: +The internal type representation follows this grammar: ```text τ ::= I8 | I16 | I32 | I64 | U8 | U16 | U32 | U64 | F32 | F64 
@@ -1088,26 +1100,22 @@ The internal type representation is the `Ty` enum, defined in `crates/sporec-typ Where: - `n` ranges over identifiers (strings). -- `id` ranges over `u32` (unique unification variable IDs). -- `C` is an **EffectSet** = `BTreeSet`, the set of effect names the function requires. -- `E` is an **ErrorSet**, the closed union of error types declared with `!`. -- `f` ranges over field names (strings) in anonymous records. +- `id` ranges over unique unification variable IDs. +- `C` is an **EffectSet** — the set of effect names the function requires. +- `E` is an **ErrorSet** — the closed union of error types declared with `!`. +- `f` ranges over field names in anonymous records. -**Never type.** `Ty::Never` is the bottom type — a subtype of all types. It is produced by diverging expressions (`panic`, non-terminating recursion). During unification, `unify(τ, Never) = ok` for all `τ`. +**Never type.** `Never` is the bottom type — a subtype of all types. It is produced by diverging expressions (`panic`, non-terminating recursion). During unification, `unify(τ, Never) = ok` for all `τ`. -**Record type.** `Ty::Record` represents anonymous structural records. Width subtyping applies: `Record([(a, I64), (b, Str), (c, Bool)])` is a subtype of `Record([(a, I64), (b, Str)])`. +**Record type.** `Record` represents anonymous structural records. Width subtyping applies: `Record([(a, I64), (b, Str), (c, Bool)])` is a subtype of `Record([(a, I64), (b, Str)])`. -**Error sentinel.** `Ty::Error` is produced when type resolution fails (e.g., unknown type name). It unifies with anything, allowing the checker to continue after errors and report multiple diagnostics. +**Error sentinel.** `Error` is produced when type resolution fails (e.g., unknown type name). It unifies with anything, allowing the checker to continue after errors and report multiple diagnostics. -**Hole type.** `Ty::Hole(name)` represents an unfilled hole. 
During unification, holes are compatible with anything — the checker records the expected type for diagnostic purposes without blocking further checking. +**Hole type.** `Hole(name)` represents an unfilled hole. During unification, holes are compatible with anything — the checker records the expected type for diagnostic purposes without blocking further checking. ### 4.2 Effect sets (EffectSet) -```rust -pub type EffectSet = BTreeSet; -``` - -An `EffectSet` is a sorted set of effect names. It appears as the third component of `Ty::Fn`; the fourth component is the function's `ErrorSet`: +An `EffectSet` is a sorted set of effect-name identifiers. It appears as the third component of a function type; the fourth component is the function's `ErrorSet`: ```text Fn([τ₁, …, τₙ], τᵣ, { cap₁, cap₂, … }, { err₁, err₂, … }) @@ -1352,43 +1360,17 @@ unify(τᵣ, τ_body) Γ ⊢ f ok ``` -This matches the implementation in `check_fn`: - -```rust -fn check_fn(&mut self, f: &FnDef) { - // bind parameters - for param in &f.params { - let ty = self.resolve_type(&param.ty); - self.env.define(param.name.clone(), ty); - } - // synthesize body type - let body_ty = self.check_expr(body); - let body_ty = self.apply_subst(&body_ty); - let declared_ret = self.apply_subst(&declared_ret); - // unify return type with body type - self.unify(&declared_ret, &body_ty, &format!("function `{}`", f.name)); -} -``` +The body checking algorithm unifies the synthesized body type with the declared return type, ensuring the body satisfies the signature contract. ### 4.4 Generics — type variables, substitution, and unification #### Type variables -A fresh type variable is created with a unique `u32` ID: - -```rust -fn fresh_var(&mut self) -> Ty { - let id = self.next_var_id; - self.next_var_id += 1; - Ty::Var(id) -} -``` - -When a generic function is called, each type parameter is replaced by a fresh `Ty::Var`. The unifier then resolves these variables by matching against actual argument types.
+A fresh type variable carries a unique identifier. When a generic function is called, each type parameter is replaced by a fresh type variable. The unifier then resolves these variables by matching against actual argument types. #### Substitution -A substitution `σ : u32 → Ty` is a partial map from variable IDs to resolved types. Applying a substitution walks the type recursively: +A substitution is a partial map from variable IDs to resolved types. Applying a substitution walks the type recursively: ```text apply_subst(Var(id)) = σ(id) if id ∈ dom(σ) @@ -1399,7 +1381,7 @@ apply_subst(Record(fields)) = Record([(f, apply_subst(τ)) for (f, τ) in apply_subst(τ) = τ for all other τ ``` -Note that `EffectSet` is **not** subject to substitution — effects are always concrete strings, never type variables. +Note that `EffectSet` is **not** subject to substitution — effects are always concrete identifiers, never type variables. #### Unification algorithm @@ -1446,7 +1428,7 @@ pub fn check_module(&mut self, module: &Module) { } ``` -**Pass 1 — `register_item`:** Iterates over all top-level items (functions, structs, type definitions) and registers their signatures in the `TypeRegistry`: +Pass 1 — Signature registration: Iterates over all top-level items (functions, structs, type definitions) and registers their signatures: - **Functions:** Parameter types are resolved, return type is resolved (default `Unit`), EffectSet is extracted from `uses` clause, and type parameters from `where` clause are recorded. - **Structs:** Field names and types are resolved and stored. @@ -1467,7 +1449,7 @@ This two-pass design allows **forward references** — a function can call anoth ### 4.6 Struct types -Struct types are registered as a list of `(field_name, Ty)` pairs. 
Field access is checked by looking up the struct name in the registry and finding the named field: +Struct lookup resolves the struct name in the type registry and looks up the named field: ```text Γ ⊢ e ⇒ Named(S) @@ -1517,26 +1499,13 @@ Record(fields₁) <: Record(fields₂) ### 4.8 Function types -Function types are `Ty::Fn(params, ret, EffectSet, ErrorSet)`: +Function types are `Fn(params, ret, EffectSet, ErrorSet)`: ```text Fn([σ₁, …, σₙ], τᵣ, C, E) ``` -A function value is a first-class value. Looking up a function name in the registry when it appears as a bare expression produces its function type: - -```rust -Expr::Var(name) => { - if let Some(ty) = self.env.lookup(name) { - ty.clone() - } else if let Some((params, ret, caps, errs)) = self.registry.functions.get(name) { - Ty::Fn(params.clone(), Box::new(ret.clone()), caps.clone(), errs.clone()) - } else { - self.err(format!("undefined variable `{name}`")); - Ty::Error - } -} -``` +A function value is a first-class value. Looking up a function name when it appears as a bare expression produces its function type from the type registry. **Display format for function types:** @@ -1600,7 +1569,7 @@ Spore primarily uses **equality-based** type compatibility via unification, with Γ ⊢ e₁ ⊗ e₂ ⇒ Bool ``` -(Relational operators require comparable operands; the reference checker unifies the two operand types. For documentation-only emphasis, **numeric** comparisons are exactly the instances where `τ = ι` or `τ = φ`.) +(Relational operators require comparable operands; both operand types must unify. For documentation-only emphasis, **numeric** comparisons are exactly the instances where `τ = ι` or `τ = φ`.) **Boolean operators** (`&&`, `||`): @@ -1705,7 +1674,7 @@ L1 explicitly excludes: arbitrary quantifiers, aliasing analysis, heap reasoning #### Trait registry -Traits are registered in a `TraitRegistry` alongside the `TypeRegistry`. Each trait records: +Traits are registered alongside types. 
Each trait records: - Trait name - Supertrait requirements (e.g., `Ord: Eq`) @@ -1851,7 +1820,10 @@ struct B { field: A } // OK: List provides indirection - **Clear error messages.** Because types are nominal and simple, error messages can say "expected `Celsius`, got `Fahrenheit`" rather than showing structural type dumps. - **Minimal annotation burden.** Only function signatures require full annotation; local variables and closures are inferred. Humans write types where they matter (API boundaries) and skip them where they don't (implementation details). - **Predictable behavior.** No implicit conversions, no SFINAE, no surprising type-level computation. If it compiles, the types mean what they say. -- **Refinement types (future) catch bugs early.** `Port` being `I64 if 1 <= self <= 65535` means invalid values are caught at compile time, not at runtime. +- **Refinement types catch bugs early.** `Port` being + `I64 when self >= 1 && self <= 65535` means invalid values are caught by the + checker or by explicit validation paths rather than being left as untyped + integers. ### Negative @@ -1875,9 +1847,9 @@ Rich type signatures give Agents strong guidance for code generation: ### Structured type information -The `Ty` enum and `TypeRegistry` are designed for programmatic access: +The type system's type registry is designed for programmatic access: -- `registry.functions` maps function names to `(param_types, return_type, cap_set)`. +- `registry.functions` maps function names to `(param_types, return_type, effect_set)`. - `registry.structs` maps struct names to field lists. - `registry.types` maps enum names to variant lists. @@ -1897,24 +1869,24 @@ The checker feeds the shared typed-hole protocol defined in SEP-0005. In practic All types implement `Display` with a canonical format: -| Type | Display | -|---|---| -| `Ty::I32` (etc.) | `I32`, `I64`, … | -| `Ty::F64` (etc.) 
| `F64`, `F32`, … | -| `Ty::Bool` | `Bool` | -| `Ty::Str` | `Str` | -| `Ty::Unit` | `()` | -| `Ty::Never` | `Never` | -| `Ty::Tuple([τ₁, τ₂])` | `(...)` / tuple spelling used by printer | -| `Ty::Refined(τ, …)` | surface refinement form | -| `Ty::Named("Foo")` | `Foo` | -| `Ty::App("List", [I64])` | `List[I64]` | -| `Ty::Record([(x, I64), (y, F64)])` | `{ x: I64, y: F64 }` | -| `Ty::Fn([I64], Bool, {}, {})` | `(I64) -> Bool` | -| `Ty::Fn([I64], Bool, {"Net"}, {})` | `(I64) -> Bool uses [Net]` | -| `Ty::Var(3)` | `?T3` | -| `Ty::Hole("h1")` | `?h1` | -| `Ty::Error` | `` | +| Type | Display | +| --------------------------------------- | ---------------------------------------- | +| `I32` (etc.) | `I32`, `I64`, … | +| `F64` (etc.) | `F64`, `F32`, … | +| `Bool` | `Bool` | +| `Str` | `Str` | +| `Unit` | `()` | +| `Never` | `Never` | +| `Tuple([τ₁, τ₂])` | `(...)` / tuple spelling used by printer | +| `Refined(τ, …)` | surface refinement form | +| `Named("Foo")` | `Foo` | +| `App("List", [I64])` | `List[I64]` | +| `Record([(x, I64), (y, F64)])` | `{ x: I64, y: F64 }` | +| `Fn([I64], Bool)`, effect set empty | `(I64) -> Bool` | +| `Fn([I64], Bool)`, effect set non-empty | `(I64) -> Bool uses [Net]` | +| `Var(3)` | `?T3` | +| `Hole("h1")` | `?h1` | +| `Error` | `` | ### Snapshot hashes @@ -1950,18 +1922,18 @@ Type diagnostics participate in the shared diagnostics protocol described in SEP ### Error categories produced by the type checker -| Category | Example message | -|---|---| -| Type mismatch | `type mismatch in function 'add': expected 'I64', got 'Str'` | -| Undefined variable | `undefined variable 'x'` | -| Arity mismatch | `function 'add' expects 2 arguments, got 3` | -| Missing effect | `missing effects [NetConnect]: caller does not declare them` | -| Missing propagated error | `function may raise [ParseError] which is not in the declared error set` | -| Redundant error item | `redundant error item 'ParseFailure': already covered by 'config.errors.ParseError'` | -| 
Non-exhaustive match | `non-exhaustive match: missing variant 'Triangle'` | -| Cannot negate type | `cannot negate type 'Str'` | -| Cannot apply `!` | `cannot apply '!' to type 'I64'` | -| Unknown field | `struct 'Point' has no field 'z'` | +| Category | Example message | +| ------------------------ | ------------------------------------------------------------------------------------ | +| Type mismatch | `type mismatch in function 'add': expected 'I64', got 'Str'` | +| Undefined variable | `undefined variable 'x'` | +| Arity mismatch | `function 'add' expects 2 arguments, got 3` | +| Missing effect | `missing effects [NetConnect]: caller does not declare them` | +| Missing propagated error | `function may raise [ParseError] which is not in the declared error set` | +| Redundant error item | `redundant error item 'ParseFailure': already covered by 'config.errors.ParseError'` | +| Non-exhaustive match | `non-exhaustive match: missing variant 'Triangle'` | +| Cannot negate type | `cannot negate type 'Str'` | +| Cannot apply `!` | `cannot apply '!' to type 'I64'` | +| Unknown field | `struct 'Point' has no field 'z'` | ### Dual-channel diagnostics @@ -1987,11 +1959,11 @@ Hole ?h1 in function `example`: 1. **Nominal rigidity.** The nominal-primary design means newtypes require explicit wrap/unwrap. This adds verbosity for simple delegation patterns. Anonymous records provide a structural escape hatch, but they cannot implement traits. -2. **EffectSet as BTreeSet\.** String-based effect names lack static verification at the `Ty` level — a misspelled effect name is only caught when matching against registered effects, not during type construction. +2. **EffectSet representation.** Effect names are identifiers without separate static verification — a misspelled effect name is caught when matching against registered effects, not during type construction. 3. 
**No higher-kinded types.** GATs + associated types cover most use cases, but abstracting over container kinds generically (functor-map over any container) requires either code generation or per-container implementations. -4. **Refinement types are only partially implemented.** L0-style aliases (`alias Port = I64 when …`) work in the reference compiler, but the full refinement vision in later sections (flow-sensitive narrowing, transitive proof obligations) is not complete. +4. **Refinement types are not yet fully specified.** L0-style aliases (`alias Port = I64 when …`) are specified here, but the full refinement vision in §4.11 (flow-sensitive narrowing, transitive proof obligations) is planned as a future extension. 5. **Annotation overhead.** Requiring full function signatures is more annotation than TypeScript or Python. This is a deliberate trade-off — signatures are the gravity center for both humans and Agents. @@ -2052,7 +2024,7 @@ Hole ?h1 in function `example`: fn apply[C](f: Fn(x: I64) -> I64 uses C, x: I64) -> I64 uses C ``` -This is compatible with the current design but not yet implemented. The string-based `BTreeSet` representation would need to be extended with effect variables. +This is compatible with the current design. Effect-set representation would need to be extended with effect variables. ## Prior art @@ -2099,13 +2071,13 @@ Spore implements a simplified version: no higher-rank polymorphism, no impredica ### This is the initial type system specification -Since this is the first formal type system specification (implemented by `sporec-typeck`), there is no older SEP to be backward-compatible with. The `Ty` enum in the reference compiler is the de facto baseline documented in §§3.1–4.1 above. +Since this is the first formal type system specification, there is no older SEP to be backward-compatible with. The type grammar in §4.1 is the normative baseline. ### Compatibility commitments -1. 
**Ty enum stability.** The variants `I8`…`U64`, `F32`, `F64`, `Bool`, `Str`, `Unit`, `Never`, `Tuple`, `Refined`, `Named`, `App`, `Record`, `Fn`, `Var`, `Hole`, `Error` are stable in the reference implementation. New variants may be added but existing variants will not be removed or renamed without a new SEP. (`Char` was removed from the language and toolchain in `spore` PR #113.) +1. **Type grammar stability.** The type constructors `I8`…`U64`, `F32`, `F64`, `Bool`, `Str`, `Unit`, `Never`, `Tuple`, `Refined`, `Named`, `App`, `Record`, `Fn`, `Var`, `Hole`, `Error` are stable. New constructors may be added but existing ones will not be removed or renamed without a new SEP. -2. **EffectSet representation.** `BTreeSet` is the current representation. Future SEPs may introduce effect variables for row-polymorphic effects, but the concrete string-based API will remain supported. +2. **EffectSet representation.** The identifier-based representation is stable. Future SEPs may introduce effect variables for row-polymorphic effects, but the concrete identifier-based API will remain supported. 3. **Two-pass checking.** The `register_item` → `check_fn` architecture is stable. Future passes (e.g., trait resolution, cost checking) will be added after these two, not replace them. 
@@ -2113,38 +2085,54 @@ Since this is the first formal type system specification (implemented by `sporec ### Migration path for future features -| Feature | Migration strategy | -|---|---| -| Refinement types (L0) | `Ty::Refined` exists in `sporec-typeck`; extend semantics/predicate classes as needed | -| Index generics | Add an `Index` kind and extend `App` to accept Index arguments | +| Feature | Migration strategy | +| ----------------------- | ----------------------------------------------------------------- | +| Refinement types (L0) | Extend semantics/predicate classes as needed | +| Index generics | Add an `Index` kind and extend `App` to accept Index arguments | | Row-polymorphic effects | Extend EffectSet with effect variables alongside concrete strings | ## Unresolved questions -1. **EffectSet in unification.** Currently, function type unification ignores EffectSets and ErrorSets (the `_` placeholders in `Fn(p1, r1, _, _), Fn(p2, r2, _, _)`). Should those sets participate in unification, or remain checked separately? Separate checking is simpler but could miss higher-order mismatches unless call-site validation stays strict. - -2. **Generic syntax: square brackets vs angle brackets.** The implemented `Ty::App` uses parenthesized syntax in the AST (`List[I64]`), but some spec examples use angle brackets (`List`). This SEP uses square brackets as the intended syntax; a separate SEP should finalize the concrete syntax. +1. **EffectSet and ErrorSet unification.** Currently, function type unification + treats effect and error sets as separately checked components. A future + type-system revision must decide whether higher-order function unification + should include those sets directly or keep the current call-site validation + split. -3. **Refinement type representation.** Where do refinement predicates live in the `Ty` enum? Options: - - (a) `Ty::Refined(Box, Predicate)` — a wrapper around any base type. 
- - (b) Store refinements in the `TypeRegistry` alongside the base type, not in `Ty` itself. - - (c) Treat refined types as named types (`Port = Named("Port")`) with predicates stored separately. +2. **Trait and impl checker rollout.** §4.12 specifies the semantic direction + for trait method signatures, default methods, and `impl` blocks. The + remaining work is to align the checker phases with that model + without changing the SEP-0001 surface grammar. -4. **Trait method type checking.** The two-pass architecture registers function signatures but does not yet handle trait method signatures, default methods, or `impl` blocks. How should these be integrated into `register_item` and `check_fn`? (See §4.12 for the specified approach.) +3. **Row-polymorphic effects.** Should effect sets support variables (for + example `uses [C]`) so higher-order functions can preserve their argument's + effect requirements? This is a possible advanced extension layered on top of + the flat effect-set model in SEP-0003. -5. **Row-polymorphic effects.** Should effect sets support variables (e.g., `uses [C]` where `C` is an effect set variable)? This would enable generic higher-order functions that preserve their argument's effect requirements. - -6. **Variance.** Generic type parameters need variance annotations (covariant, contravariant, invariant) for soundness when subtyping is introduced. Should variance be inferred or declared? +4. **Variance.** Generic type parameters need variance annotations or inference + for soundness if broader subtyping is introduced. This remains tied to the + future subtyping story. ### Resolved questions The following questions from earlier drafts are now resolved by this specification: -1. **Occurs check.** ✅ Resolved — the unifier includes an occurs check (§4.4). Cyclic substitutions like `Var(0) ↦ List[Var(0)]` are rejected. +1. **Occurs check.** Resolved — the unifier includes an occurs check (§4.4). 
+ Cyclic substitutions like `Var(0) ↦ List[Var(0)]` are rejected. + +2. **Never type semantics.** Resolved — `unify(τ, Never) = ok` for all `τ` + (§4.16). Never is handled as a bottom type directly within the unifier. + +3. **Error type representation.** Resolved — error sets are a component of + `Ty::Fn`, making it `Fn(params, ret, EffectSet, ErrorSet)` (§4.15). `?` + propagation uses canonical subset/union semantics, while signature hashing stays + conservative over the written error clause. -2. **Never type semantics.** ✅ Resolved — `unify(τ, Never) = ok` for all `τ` (§4.16). Never is handled as a bottom type directly within the unifier. +4. **Generic syntax.** Resolved by SEP-0001: generic application uses square + brackets, for example `List[I64]` and `Result[T, E]`. Angle-bracket examples + are non-canonical and should be migrated. -3. **Error type representation.** ✅ Resolved — error sets are a component of -`Ty::Fn`, making it `Fn(params, ret, EffectSet, ErrorSet)` (§4.15). `?` -propagation uses canonical subset/union semantics, while signature hashing stays -conservative over the written error clause. +5. **L0 refinement representation.** Resolved by this specification: + `Ty::Refined` is the compiler representation for current L0 refinements. + Richer predicate classes may extend that model without reopening the surface + syntax. diff --git a/seps/SEP-0003-effect-system.md b/seps/SEP-0003-effect-system.md index f127c75..5bae13d 100644 --- a/seps/SEP-0003-effect-system.md +++ b/seps/SEP-0003-effect-system.md @@ -7,6 +7,7 @@ authors: - Zhan Rongrui created: 2026-03-31 requires: + - 1 - 2 discussion: "https://github.com/spore-lang/spore-evolution/discussions/3" pr: null @@ -19,7 +20,7 @@ superseded_by: null ## Summary -This SEP introduces Spore's **effect system**: a compile-time mechanism that tracks how functions interact with the outside world. Every function declares — via a `uses [...]` clause — the set of *atomic effects* it requires. 
The compiler verifies that a function body never exercises an effect absent from its declared set, auto-infers semantic properties (pure, deterministic, total) from that set, and uses set-inclusion as the basis for subtyping and effect narrowing. +This SEP introduces Spore's **effect system**: a compile-time mechanism that tracks how functions interact with the outside world. Every function declares — via a `uses [...]` clause — the set of _atomic effects_ it requires. The compiler verifies that a function body never exercises an effect absent from its declared set, auto-infers semantic properties (pure, deterministic, total) from that set, and uses set-inclusion as the basis for subtyping and effect narrowing. The built-in effect vocabulary is **intent-oriented**: each built-in effect answers "What does this code intend to do with the outside world?" Pure computation is the default state — no effect declaration is needed for it. Mutable state is tracked by the language semantics, not by built-in external effect names. In compiler internals and tooling protocols, these effect names form the effect set used for subset checks, platform ceilings, and machine-readable fields such as `effects`. @@ -27,17 +28,11 @@ The design is intentionally **flat and monomorphic**: effect sets are finite set Concrete surface syntax for `effect`, `handler`, `perform`, and `handle ... with` is defined in SEP-0001. This SEP focuses on semantics, algebra, typing, protocol fields, and diagnostics. -> **Release-safety note**: Handler discharge and the unified declaration form in -> this SEP describe the target behavior of the current compositional-semantics -> wave. The shipping implementation already checks explicit `perform` usage and -> effect sets, but parser/runtime support for the full unified handler surface is -> still being completed. 
- --- ## Motivation -Modern programs interleave pure computation with diverse side effects — file I/O, networking, mutable state, concurrency, randomness, process control. Without a disciplined tracking mechanism these effects become invisible at function boundaries, making it hard for both humans and automated agents to reason about what a function *can do*. +Modern programs interleave pure computation with diverse side effects — file I/O, networking, mutable state, concurrency, randomness, process control. Without a disciplined tracking mechanism these effects become invisible at function boundaries, making it hard for both humans and automated agents to reason about what a function _can do_. ### Problems addressed @@ -53,14 +48,14 @@ Modern programs interleave pure computation with diverse side effects — file I ### Design goals -| Goal | Mechanism | -|------|-----------| -| Explicit effect tracking | `uses [...]` clause on every function | -| Zero-cost purity | Omitting `uses` ≡ `uses []` (pure) | -| Composable aliases | `effect` keyword for named groups and aliases | -| Sound subtyping | Set inclusion on effect sets | +| Goal | Mechanism | +| ------------------------ | ---------------------------------------------------- | +| Explicit effect tracking | `uses [...]` clause on every function | +| Zero-cost purity | Omitting `uses` ≡ `uses []` (pure) | +| Composable aliases | `effect` keyword for named groups and aliases | +| Sound subtyping | Set inclusion on effect sets | | Auto-inferred properties | `pure`, `deterministic`, `total` derived from `uses` | -| Agent integration | `available_effects` emitted in `HoleReport` JSON | +| Agent integration | `available_effects` emitted in `HoleReport` JSON | --- @@ -95,7 +90,7 @@ fn greet(name: Str) uses [Console] { } ``` -If a function needs no effects it is *pure* and the `uses` clause may be omitted entirely: +If a function needs no effects it is _pure_ and the `uses` clause may be omitted entirely: ```spore fn add(a: 
I64, b: I64) -> I64 { @@ -108,18 +103,18 @@ fn add(a: I64, b: I64) -> I64 { Spore ships with the following intent-oriented atomic effects. Each one answers: "What does this code intend to do with the outside world?" -| Effect | Intent | Typical operations | -|---|---|---| -| `Console` | User interaction (terminal I/O) | `println`, `eprintln`, `read_line` | -| `FileRead` | Persistent data access | `File.read`, `Dir.list` | -| `FileWrite` | Persistent data modification | `File.write`, `Dir.create`, `File.delete` | -| `NetConnect` | External communication (outbound) | `http.get`, `http.post`, `tcp.connect` | -| `NetListen` | Service provision (inbound) | `tcp.listen`, `http.serve` | -| `Env` | Configuration access | `Env.get`, `Env.vars` | -| `Spawn` | Subprocess management | `Cmd.exec`, `spawn { ... }` | -| `Clock` | Time-dependent computation | `now()`, `elapsed()` | -| `Random` | Non-deterministic computation | `random()`, `uuid()` | -| `Exit` | Process lifecycle control | `exit()`, `abort()` | +| Effect | Intent | Typical operations | +| ------------ | --------------------------------- | ----------------------------------------- | +| `Console` | User interaction (terminal I/O) | `println`, `eprintln`, `read_line` | +| `FileRead` | Persistent data access | `File.read`, `Dir.list` | +| `FileWrite` | Persistent data modification | `File.write`, `Dir.create`, `File.delete` | +| `NetConnect` | External communication (outbound) | `http.get`, `http.post`, `tcp.connect` | +| `NetListen` | Service provision (inbound) | `tcp.listen`, `http.serve` | +| `Env` | Configuration access | `Env.get`, `Env.vars` | +| `Spawn` | Subprocess management | `Cmd.exec`, `spawn { ... }` | +| `Clock` | Time-dependent computation | `now()`, `elapsed()` | +| `Random` | Non-deterministic computation | `random()`, `uuid()` | +| `Exit` | Process lifecycle control | `exit()`, `abort()` | `Exit` authorizes explicit process termination. 
Startup contracts, runtime exit propagation, and host exit-code conversion are Platform/runtime concerns @@ -129,11 +124,11 @@ specified in SEP-0008 rather than in this effect SEP. The guiding principle is: **each built-in effect answers "What does this code intend to do with the outside world?"** This leads to several deliberate design choices: -1. **No `Compute` effect.** Pure computation is the default state — no effect declaration is needed. Every function can compute; effects describe what *additional* external powers a function needs beyond pure computation. A function with `uses []` (or no `uses` clause) is pure and can compute freely. +1. **No `Compute` effect.** Pure computation is the default state — no effect declaration is needed. Every function can compute; effects describe what _additional_ external powers a function needs beyond pure computation. A function with `uses []` (or no `uses` clause) is pure and can compute freely. -2. **No `StateRead`/`StateWrite`.** Mutable state is tracked by the language semantics, not by built-in external effect names. The built-in effects describe interactions with the *external world* — the filesystem, the network, the terminal, the clock. Internal mutable state (for example, a local cache) is an implementation detail, not an intent to interact with the outside world. +2. **No `StateRead`/`StateWrite`.** Mutable state is tracked by the language semantics, not by built-in external effect names. The built-in effects describe interactions with the _external world_ — the filesystem, the network, the terminal, the clock. Internal mutable state (for example, a local cache) is an implementation detail, not an intent to interact with the outside world. -3. **`NetConnect`/`NetListen` instead of `NetRead`/`NetWrite`.** The old names described data direction, but the real intent distinction is *client vs server*. An HTTP client both reads and writes the network, but its intent is "connect to an external service." 
Similarly, a server both reads and writes, but its intent is "listen for incoming connections." +3. **`NetConnect`/`NetListen` instead of `NetRead`/`NetWrite`.** The old names described data direction, but the real intent distinction is _client vs server_. An HTTP client both reads and writes the network, but its intent is "connect to an external service." Similarly, a server both reads and writes, but its intent is "listen for incoming connections." 4. **`Console` for terminal I/O.** `println("hello")` is user interaction, not file writing, even though stdout is technically a file descriptor. Distinguishing terminal I/O from filesystem I/O reflects a real difference in intent. @@ -143,27 +138,27 @@ The guiding principle is: **each built-in effect answers "What does this code in The `basic-cli` Platform currently maps its package modules to built-in effects as follows: -| Operation | Required effect | -|---|---| -| `basic_cli.stdout.print`, `basic_cli.stdout.println`, `basic_cli.stdout.eprint`, `basic_cli.stdout.eprintln` | `Console` | -| `basic_cli.stdin.read_line` | `Console` | -| `basic_cli.file.file_read`, `basic_cli.file.file_exists`, `basic_cli.file.file_stat` | `FileRead` | -| `basic_cli.file.file_write` | `FileWrite` | -| `basic_cli.dir.dir_list` | `FileRead` | -| `basic_cli.dir.dir_mkdir` | `FileWrite` | -| `basic_cli.env.env_get`, `basic_cli.env.env_set` | `Env` | -| `basic_cli.cmd.process_run`, `basic_cli.cmd.process_run_status` | `Spawn` | -| `basic_cli.cmd.exit` | `Exit` | +| Operation | Required effect | +| ------------------------------------------------------------------------------------------------------------ | --------------- | +| `basic_cli.stdout.print`, `basic_cli.stdout.println`, `basic_cli.stdout.eprint`, `basic_cli.stdout.eprintln` | `Console` | +| `basic_cli.stdin.read_line` | `Console` | +| `basic_cli.file.file_read`, `basic_cli.file.file_exists`, `basic_cli.file.file_stat` | `FileRead` | +| `basic_cli.file.file_write` | `FileWrite` | +| 
`basic_cli.dir.dir_list` | `FileRead` | +| `basic_cli.dir.dir_mkdir` | `FileWrite` | +| `basic_cli.env.env_get`, `basic_cli.env.env_set` | `Env` | +| `basic_cli.cmd.process_run`, `basic_cli.cmd.process_run_status` | `Spawn` | +| `basic_cli.cmd.exit` | `Exit` | ### Defining effect aliases The `effect` keyword creates a **named alias** that expands into a flat set of atomic effects: ```spore -effect FileIO = FileRead | FileWrite -effect CLI = Console | FileRead | FileWrite | Env | Spawn | Exit -effect Server = NetListen | FileRead | FileWrite | Clock | Random -effect HttpClient = NetConnect | Clock +effect FileIO = FileRead | FileWrite; +effect CLI = Console | FileRead | FileWrite | Env | Spawn | Exit; +effect Server = NetListen | FileRead | FileWrite | Clock | Random; +effect HttpClient = NetConnect | Clock; ``` Aliases expand recursively and flatten: @@ -173,16 +168,15 @@ CLI → {Console, FileRead, FileWrite, Env, Spawn, Exit} ``` Aliases are purely syntactic sugar in `uses [...]`: after expansion, only atomic -effects remain. The current implementation also ships a small builtin alias -hierarchy for `IO`, `FileIO`, and `NetIO`, but **same-module declarations -shadow those builtin names**. In other words, a local `effect IO = Console` or -`effect IO { ... }` is treated as that local declaration, not as the builtin -filesystem/network bundle. +effects remain. A small builtin alias hierarchy provides `IO`, `FileIO`, and +`NetIO` for convenience, but **same-module declarations shadow those builtin +names**. In other words, a local `effect IO = Console` or `effect IO { ... }` is +treated as that local declaration, not as the builtin filesystem/network bundle. ### Using effects in practice ```spore -effect HttpClient = NetConnect | Clock +effect HttpClient = NetConnect | Clock; fn query_api(url: Url) -> Data ! NetworkError uses [HttpClient] @@ -262,19 +256,19 @@ Why does this type-check? The compiler automatically derives semantic properties from the declared effect set. 
No manual annotation is required: -| Declared `uses` | Inferred properties | -|---|---| -| `uses []` (or omitted) | pure, deterministic, total* | -| `uses [Console]` | ¬pure | -| `uses [FileRead]` | ¬pure, deterministic | -| `uses [Random]` | ¬pure, ¬deterministic | -| `uses [NetConnect, Spawn]` | ¬pure, deterministic | +| Declared `uses` | Inferred properties | +| -------------------------- | ---------------------------- | +| `uses []` (or omitted) | pure, deterministic, total\* | +| `uses [Console]` | ¬pure | +| `uses [FileRead]` | ¬pure, deterministic | +| `uses [Random]` | ¬pure, ¬deterministic | +| `uses [NetConnect, Spawn]` | ¬pure, deterministic | -(*total requires a separate termination analysis; see Reference-level explanation §5.4.) +(\*total requires a separate termination analysis; see Reference-level explanation §5.4.) ### Incomplete functions — missing `uses` declarations -A function that calls operations requiring effects but does **not** declare a `uses` clause is an *incomplete function*. The compiler treats this as an error with a fix suggestion: +A function that calls operations requiring effects but does **not** declare a `uses` clause is an _incomplete function_. The compiler treats this as an error with a fix suggestion: ```spore fn save_data(data: Data) -> Unit { @@ -380,7 +374,7 @@ uses [] Let **E** be the universe of atomic effects. Each element of **E** is an indivisible identifier representing one way a program may interact with the external world. 
-An **effect set** *S* is a finite subset of **E**: +An **effect set** _S_ is a finite subset of **E**: $$S \subseteq E, \quad |S| < \infty$$ @@ -390,21 +384,21 @@ The empty set `{}` denotes a pure function — no interaction with the outside w Effect sets obey standard finite-set algebra: -| Property | Formula | Consequence | -|---|---|---| -| Commutativity | {A, B} = {B, A} | Declaration order is irrelevant | -| Idempotence | {A, A} = {A} | Duplicate declarations collapse | -| Associativity | Nested aliases flatten | `[FileIO, NetConnect]` = `[FileRead, FileWrite, NetConnect]` | -| Identity element | {} (empty set) | The identity for union: S ∪ {} = S | +| Property | Formula | Consequence | +| ---------------- | ---------------------- | ------------------------------------------------------------ | +| Commutativity | {A, B} = {B, A} | Declaration order is irrelevant | +| Idempotence | {A, A} = {A} | Duplicate declarations collapse | +| Associativity | Nested aliases flatten | `[FileIO, NetConnect]` = `[FileRead, FileWrite, NetConnect]` | +| Identity element | {} (empty set) | The identity for union: S ∪ {} = S | #### Set operations -| Operation | Symbol | Use | -|---|---|---| -| Union | S₁ ∪ S₂ | Sequential composition, conditional branches | -| Subset | S₁ ⊆ S₂ | Subtype check, effect narrowing | -| Intersection | S₁ ∩ S₂ | Property inference | -| Difference | S₁ \ S₂ | Reserved; not currently exposed in syntax | +| Operation | Symbol | Use | +| ------------ | ------- | -------------------------------------------- | +| Union | S₁ ∪ S₂ | Sequential composition, conditional branches | +| Subset | S₁ ⊆ S₂ | Subtype check, effect narrowing | +| Intersection | S₁ ∩ S₂ | Property inference | +| Difference | S₁ \ S₂ | Reserved; not currently exposed in syntax | ### 3. 
Effect alias definition and expansion @@ -419,7 +413,7 @@ $$\texttt{uses } [C] \equiv \texttt{uses } [A_1, A_2, \ldots, A_n]$$ Expansion is **recursive**: if any $A_i$ is itself an alias, it is expanded until every element is an atomic effect. The result is always a flat set. -The shipped implementation also reserves builtin aliases `IO`, `FileIO`, and +The language reserves builtin aliases `IO`, `FileIO`, and `NetIO` for convenience expansion, but those names are **not global keywords**: same-module declared effects and effect aliases with the same names shadow the builtin hierarchy. @@ -460,13 +454,13 @@ where: #### 4.2 Internal representation -In the compiler's type IR the function type is represented as: +In the type system, the function type carries an effect set: -```rust -Ty::Fn(Vec, Box, EffectSet) +```text +(T₁, T₂, …, Tₙ) -> R uses S ``` -where `EffectSet = BTreeSet`. A `BTreeSet` is chosen over `HashSet` for deterministic ordering in diagnostics and serialisation. +where the effect components use a deterministically ordered set for consistent diagnostics and serialisation. #### 4.3 Shorthand @@ -488,16 +482,16 @@ $$\mathcal{P}(\text{total}, S) = \text{determined by a separate termination anal The complete rule: `𝒫(pure, S) = true` iff `S = ∅`. Pure computation requires no declared effect — it is the default. `Spawn` is not pure-compatible: creating schedulable work has observable concurrency and scheduling consequences even when the spawned body is deterministic. Determinism remains a separate property. -| Effect set S | pure? 
| Rationale | -|---|---|---| -| `{}` | true | No effects at all | -| `{Spawn}` | false | `spawn` introduces observable concurrency/scheduling behavior | -| `{Console}` | false | Console interacts with the terminal — an external I/O channel | -| `{Clock}` | false | Clock reads the external world (system time) | -| `{Random}` | false | Random reads external entropy | -| `{Env}` | false | Env reads the process environment — an external configuration source | -| `{Exit}` | false | `exit` terminates the process — an observable external effect | -| `{NetConnect}` | false | Outbound network access is external I/O | +| Effect set S | pure? | Rationale | +| -------------- | ----- | -------------------------------------------------------------------- | +| `{}` | true | No effects at all | +| `{Spawn}` | false | `spawn` introduces observable concurrency/scheduling behavior | +| `{Console}` | false | Console interacts with the terminal — an external I/O channel | +| `{Clock}` | false | Clock reads the external world (system time) | +| `{Random}` | false | Random reads external entropy | +| `{Env}` | false | Env reads the process environment — an external configuration source | +| `{Exit}` | false | `exit` terminates the process — an observable external effect | +| `{NetConnect}` | false | Outbound network access is external I/O | #### 5.1 Implication chain @@ -535,7 +529,7 @@ Effect sets induce subtyping via set inclusion. Function types are **contravaria $$S_1 \subseteq S_2 \implies (\tau \to \rho \ \textbf{uses}\ S_1) <: (\tau \to \rho \ \textbf{uses}\ S_2)$$ -A function that requires *fewer* effects is more general and can be used wherever a more-capable function is expected. +A function that requires _fewer_ effects is more general and can be used wherever a more-capable function is expected. ```text S₁ ⊆ S₂ @@ -587,14 +581,14 @@ Read: "Under type context Γ and effect set S, expression e has type T." 
When multiple expressions are combined the compiler computes the composite effect set: -| Composition form | Effect computation | -|---|---| -| `A; B` | S_A ∪ S_B | -| `if c then A else B` | S_c ∪ S_A ∪ S_B | -| `match x { p₁ => A, p₂ => B, … }` | S_x ∪ S_A ∪ S_B ∪ … | -| `f(x)` where f `uses S_f` | Requires S_f ⊆ S_scope | +| Composition form | Effect computation | +| -------------------------------------- | -------------------------------------------- | +| `A; B` | S_A ∪ S_B | +| `if c then A else B` | S_c ∪ S_A ∪ S_B | +| `match x { p₁ => A, p₂ => B, … }` | S_x ∪ S_A ∪ S_B ∪ … | +| `f(x)` where f `uses S_f` | Requires S_f ⊆ S_scope | | `spawn { body }` where body `uses S_b` | Requires `Spawn ∈ S_scope` and S_b ⊆ S_scope | -| `let x = e₁ in e₂` | S_e₁ ∪ S_e₂ | +| `let x = e₁ in e₂` | S_e₁ ∪ S_e₂ | Formal rules for the two most important cases: @@ -653,7 +647,7 @@ residual(handle e with h̄) = (S_body \ H) ∪ S_impl Intuition: 1. effects covered by the active handlers stop leaking outward; -2. any effects required to *run the handlers themselves* still count; and +2. any effects required to _run the handlers themselves_ still count; and 3. unhandled effects remain visible to the outer scope. This rule is purely semantic. Users still declare ordinary `uses [...]` @@ -698,7 +692,7 @@ If the check fails, the compiler emits a `effect-violation` diagnostic listing t ### 10. Closure effect capture -A closure defined within a context with effect set *S* has an inferred effect set *S'* where *S'* ⊆ *S*. The inference is determined by the effects actually exercised in the closure body: +A closure defined within a context with effect set _S_ has an inferred effect set _S'_ where _S'_ ⊆ _S_. 
The inference is determined by the effects actually exercised in the closure body: ```spore fn example() -> Unit @@ -770,20 +764,20 @@ ERROR [effect-violation] Hole ?fetch_logic filled code uses unauthorised effects The cost model maintains a **four-dimensional cost vector**: `(compute, alloc, io, parallel)`. -| Dimension | Abbreviation | Meaning | Unit | -|---|---|---|---| -| Compute | `C` | CPU operation steps | op (operation) | -| Allocation | `A` | Heap memory allocation | cell (abstract memory unit) | -| I/O | `W` | Side-effect / external call count | call | -| Parallelism | `P` | Parallel execution width | lane | +| Dimension | Abbreviation | Meaning | Unit | +| ----------- | ------------ | --------------------------------- | --------------------------- | +| Compute | `C` | CPU operation steps | op (operation) | +| Allocation | `A` | Heap memory allocation | cell (abstract memory unit) | +| I/O | `W` | Side-effect / external call count | call | +| Parallelism | `P` | Parallel execution width | lane | Effect sets provide hard upper-bound constraints on these cost dimensions: -| Condition | Cost dimension constraint | -|---|---| -| `uses {}` | `io = 0` (guaranteed no I/O overhead) | -| S ∩ {NetConnect, NetListen, FileRead, FileWrite, Console} ≠ ∅ | `io > 0` possible | -| `Spawn ∈ S` | `parallel > 0` possible | +| Condition | Cost dimension constraint | +| ------------------------------------------------------------- | ------------------------------------- | +| `uses {}` | `io = 0` (guaranteed no I/O overhead) | +| S ∩ {NetConnect, NetListen, FileRead, FileWrite, Console} ≠ ∅ | `io > 0` possible | +| `Spawn ∈ S` | `parallel > 0` possible | The relationship is a **necessary condition**: if the effect set excludes all I/O effects, the cost model's I/O dimension is provably zero. 
@@ -812,17 +806,17 @@ effect FileWrite { } effect Env { - fn get(name: Str) -> Option[Str] - fn vars() -> List[(Str, Str)] + fn get(name: Str) -> Option[Str]; + fn vars() -> List[(Str, Str)]; } ``` -Declaring `uses [FileRead]` authorizes the effect, but operations remain explicit: `perform FileRead.read_file(path)` dispatches to the active `FileRead` handler. Canonical v0.1 semantics require the corresponding `effect` interface to be declared explicitly; undeclared pseudo-effect paths are compatibility-only. +Declaring `uses [FileRead]` authorizes the effect, but operations remain explicit: `perform FileRead.read_file(path)` dispatches to the active `FileRead` handler. Canonical semantics require the corresponding `effect` interface to be declared explicitly; undeclared pseudo-effect paths are compatibility-only. -Effect aliases are part of the committed v0.1 surface: +Effect aliases are part of the committed surface: ```spore -effect FileIO = FileRead | FileWrite +effect FileIO = FileRead | FileWrite; ``` This expands semantically to the union of the two effects and does not define a new operation surface of its own. @@ -862,7 +856,7 @@ uses [] The first example is a mock handler. The second shows the same declaration shape applied to a different effect family. A Platform package installs -handlers using the same model; the difference is only *where* the handler +handlers using the same model; the difference is only _where_ the handler instance comes from (project startup / adapter wiring) rather than any special Platform-only semantics. 
@@ -900,9 +894,9 @@ handle { } ``` -#### User-defined effects (allowed from v1) +#### User-defined effects -Users may define their own effects from the first version of Spore: +Users may define their own effects: ```spore effect RateLimit { @@ -1020,15 +1014,15 @@ The language server protocol integration should expose effect information throug ### New diagnostics -| Code | Severity | Message template | -|---|---|---| -| `effect-violation` | Error | Function body uses effect `{cap}` not declared in `uses` clause. Declared: `{declared}`. Required: `{required}`. Excess: `{excess}`. | -| `cap-closure-violation` | Error | Closure passed to `{fn_name}` must be pure (`uses []`), but it uses `{caps}`. | -| `cap-spawn-missing` | Error | `spawn` expression requires `Spawn` effect, but current scope declares `uses {scope_caps}`. | -| `cap-narrowing-violation` | Error | Spawn body uses `{child_caps}` which is not a subset of parent scope `{parent_caps}`. Excess: `{excess}`. | -| `cap-unknown` | Error | Unknown effect `{name}`. Did you mean `{suggestion}`? | -| `cap-alias-cycle` | Error | Effect alias `{name}` contains a cycle: `{cycle_path}`. | -| `cap-redundant` | Warning | Effect `{cap}` is declared but never used in the function body. | +| Code | Severity | Message template | +| ------------------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------ | +| `effect-violation` | Error | Function body uses effect `{cap}` not declared in `uses` clause. Declared: `{declared}`. Required: `{required}`. Excess: `{excess}`. | +| `cap-closure-violation` | Error | Closure passed to `{fn_name}` must be pure (`uses []`), but it uses `{caps}`. | +| `cap-spawn-missing` | Error | `spawn` expression requires `Spawn` effect, but current scope declares `uses {scope_caps}`. 
| +| `cap-narrowing-violation` | Error | Spawn body uses `{child_caps}` which is not a subset of parent scope `{parent_caps}`. Excess: `{excess}`. | +| `cap-unknown` | Error | Unknown effect `{name}`. Did you mean `{suggestion}`? | +| `cap-alias-cycle` | Error | Effect alias `{name}` contains a cycle: `{cycle_path}`. | +| `cap-redundant` | Warning | Effect `{cap}` is declared but never used in the function body. | ### Diagnostic quality guidelines @@ -1071,7 +1065,7 @@ error[effect-violation]: function body uses undeclared effect 5. **Alias is expansion only.** Effect aliases do not create new abstract effects. This prevents hiding implementation details behind an alias boundary — the caller always sees the expanded set. -6. **String-based EffectSet.** Using `BTreeSet` for the internal representation is simple but offers no compile-time interning or efficient bitset operations. This may need revisiting for large-scale codebases with many effects. +6. **EffectSet representation.** Using identifier-based effect sets is simple but offers no compile-time interning or efficient bitset operations. This may need revisiting for large-scale codebases with many effects. --- @@ -1113,7 +1107,7 @@ Algebraic effect handlers (as in Koka or OCaml 5) allow effects to be intercepte Spore's handler model is intentionally simpler than full algebraic effects: handlers are lexical, non-resumable, and one-shot. A matching handler arm computes the value of the corresponding `perform` expression directly; there is -no continuation capture or `resume()` path in canonical v0.1 semantics. +no continuation capture or `resume()` path in canonical semantics. Discharge is explicit: handled effects are removed from the local residual set, while handler implementation effects remain visible to the enclosing scope. Normal, mock, and Platform handlers all follow this same rule. @@ -1130,18 +1124,18 @@ Encoding effects in the type system via monads (e.g., `IO a`, `State s a`). 
## Prior art -| System | Approach | Relation to this SEP | -|---|---|---| -| **Koka** | Algebraic effects with row-polymorphic effect types | Spore takes the same "effects in the type" philosophy with flat sets and simplified handlers (no continuations) | -| **Eff** | First-class algebraic effects and handlers | Spore adopts simplified handlers (no continuations); effects are statically checked with runtime handler dispatch | -| **Rust** | No built-in effect system; `unsafe` is the only effect marker | Spore generalises `unsafe` to a full effect vocabulary | -| **Haskell** | `IO` monad, mtl-style monad transformers | Spore replaces monadic encoding with flat set annotation | -| **OCaml 5** | Algebraic effects for concurrency | Spore's `Spawn` effect is analogous; Spore adopts simplified handlers without continuations | -| **Scala (ZIO)** | Environment type `R` in `ZIO[R, E, A]` | Similar in spirit; ZIO's `R` is an intersection type, Spore uses a flat set | -| **Unison** | Ability types (algebraic effects) | Close conceptual ancestor; Spore simplifies by removing polymorphism | -| **Android Manifest** | Permission declarations | Same "declare what you need" philosophy at the OS level | -| **Wasm Component Model** | Import/export effects | Static effect declaration before instantiation | -| **Java (checked exceptions)** | `throws` clause | Spore's `uses` is analogous but tracks effects rather than error types | +| System | Approach | Relation to this SEP | +| ----------------------------- | ------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------- | +| **Koka** | Algebraic effects with row-polymorphic effect types | Spore takes the same "effects in the type" philosophy with flat sets and simplified handlers (no continuations) | +| **Eff** | First-class algebraic effects and handlers | Spore adopts simplified handlers (no continuations); effects are 
statically checked with runtime handler dispatch | +| **Rust** | No built-in effect system; `unsafe` is the only effect marker | Spore generalises `unsafe` to a full effect vocabulary | +| **Haskell** | `IO` monad, mtl-style monad transformers | Spore replaces monadic encoding with flat set annotation | +| **OCaml 5** | Algebraic effects for concurrency | Spore's `Spawn` effect is analogous; Spore adopts simplified handlers without continuations | +| **Scala (ZIO)** | Environment type `R` in `ZIO[R, E, A]` | Similar in spirit; ZIO's `R` is an intersection type, Spore uses a flat set | +| **Unison** | Ability types (algebraic effects) | Close conceptual ancestor; Spore simplifies by removing polymorphism | +| **Android Manifest** | Permission declarations | Same "declare what you need" philosophy at the OS level | +| **Wasm Component Model** | Import/export effects | Static effect declaration before instantiation | +| **Java (checked exceptions)** | `throws` clause | Spore's `uses` is analogous but tracks effects rather than error types | --- @@ -1153,17 +1147,7 @@ Spore is a new language and this SEP defines a core feature of its type system. ### Interaction with SEP-0002 -This SEP depends on SEP-0002 (Type System). The `EffectSet` is integrated into the function type representation defined there: - -```rust -enum Ty { - I64, - Bool, - Str, - Fn(Vec, Box, EffectSet), // params, return, effects - // ... -} -``` +This SEP depends on SEP-0002 (Type System). The effect set is integrated into the function type representation: `Fn(params, ret, S)` where `S` is the set of required effect names. ### Future extensibility @@ -1175,36 +1159,28 @@ enum Ty { ## Unresolved questions -### 1. Effect subtraction syntax - -Should Spore support a subtraction syntax such as `uses [All \ Spawn]`? This depends on the definition of the universal set **E**, which may vary across platforms. Current decision: **not supported**. Developers must enumerate effects explicitly. - -### 2. 
Platform effect ceilings - -How does a module-level `platform [Web]` declaration interact with function-level `uses` clauses? Two options: +### 1. Effect identity, scoping, and imports -- **Intersection model:** The effective effect set is `S_function ∩ S_platform`. -- **Constraint model:** The compiler checks `S_function ⊆ S_platform` and rejects violations. +Third-party packages may define new atomic effects. The remaining design +question is how those names are made globally unambiguous across packages: +qualified module paths, package-qualified names, or another identity layer tied +to the module/package system in SEP-0008. -The constraint model (static rejection) is preferred but needs formal specification. +### 2. Fine-grained effect policy -### 3. Effect scoping and imports +The language-level model currently uses coarse intent-oriented effects such as +`FileRead`, `FileWrite`, `Console`, and `NetConnect`. More granular policies +such as path-scoped filesystem access or separate console read/write permissions +remain a future extension. SEP-0008 currently treats those as manifest/platform +policy rather than core effect algebra. -Can third-party libraries define new atomic effects? If so, how are they scoped and imported? Should there be a namespace mechanism (`mylib::MyEffect`)? +### 3. Effect evolution and deprecation -### 4. Granularity of file-system effects +When an atomic effect is deprecated, renamed, or split, the language needs a +migration story: diagnostics, compatibility aliases, possible `@deprecated` +metadata, and any automated rewrites. -Is `FileRead` / `FileWrite` the right granularity, or should effects be path-scoped (e.g., `FileRead("/etc/config")`)? Path-scoping adds expressiveness but complicates the set algebra. - -### 5. Granularity of `Console` effect - -Should `Console` be split further (e.g., `ConsoleRead` for stdin vs `ConsoleWrite` for stdout/stderr)? 
A single `Console` is simpler and covers the common case where terminal I/O is bidirectional, but finer granularity may be useful for sandboxing scenarios where a program should print output but not read input. - -### 6. Effect evolution and deprecation - -When an atomic effect is deprecated or split (e.g., `FileIO` split into `FileRead` + `FileWrite`), what migration tooling is needed? Should the compiler support `@deprecated` annotations on effect definitions? - -### 7. Interaction with generics +### 4. Interaction with generics How do generic type parameters interact with effect sets? For example: @@ -1216,25 +1192,33 @@ fn apply[T, R](f: (T) -> R, x: T) -> R { This currently works only with pure `f`. If `f` has effects, should `apply` need to declare them? Without effect polymorphism, the answer is that `apply` must be specialised for each effect set, which may require monomorphisation or overloading. -### 8. Runtime effect tokens +### 5. Runtime effect tokens Should effect tokens have a runtime representation (e.g., for dependency injection in tests), or are they purely a compile-time concept? A hybrid model where effects are erased by default but can be reified for testing purposes may be desirable. +### Resolved or delegated questions + +- **Effect subtraction syntax** is not part of the accepted surface. Spore does not support + `uses [All \ Spawn]`; developers enumerate effects explicitly. +- **Platform ceilings** are delegated to SEP-0008. If standardized, the expected + model is static constraint checking (`S_function ⊆ S_platform`), not + silently intersecting a function's declared effect set. 
+ --- ## Appendix A: Formal notation quick reference -| # | Notation | Meaning | -|---|----------|---------| -| 1 | **E** | Universe of atomic effects | -| 2 | S, S₁, S₂ | Effect sets (finite subsets of **E**) | -| 3 | {} or ∅ | Empty effect set (pure function) | -| 4 | S₁ ⊆ S₂ | S₁ is a subset of S₂ | -| 5 | S₁ ∪ S₂ | Union of S₁ and S₂ | -| 6 | S₁ ∩ S₂ | Intersection of S₁ and S₂ | -| 7 | (T → R uses S) | Function type: parameter T, return R, effect set S | -| 8 | Γ; S ⊢ e : T | Typing judgement: under context Γ and effect set S, expression e has type T | -| 9 | 𝒫(prop, S) | Property inference function: determines property `prop` from effect set S | -| 10 | <: | Subtype relation | -| 11 | (C, A, W, P) | Four-dimensional cost vector: compute(op), alloc(cell), io(call), parallel(lane) | -| 12 | `effect C = A₁ | A₂ | ... | Aₙ` | Named alias definition expanding to a flat set of atomic effects | +| # | Notation | Meaning | +| --- | -------------- | -------------------------------------------------------------------------------- | +| 1 | **E** | Universe of atomic effects | +| 2 | S, S₁, S₂ | Effect sets (finite subsets of **E**) | +| 3 | {} or ∅ | Empty effect set (pure function) | +| 4 | S₁ ⊆ S₂ | S₁ is a subset of S₂ | +| 5 | S₁ ∪ S₂ | Union of S₁ and S₂ | +| 6 | S₁ ∩ S₂ | Intersection of S₁ and S₂ | +| 7 | (T → R uses S) | Function type: parameter T, return R, effect set S | +| 8 | Γ; S ⊢ e : T | Typing judgement: under context Γ and effect set S, expression e has type T | +| 9 | 𝒫(prop, S) | Property inference function: determines property `prop` from effect set S | +| 10 | <: | Subtype relation | +| 11 | (C, A, W, P) | Four-dimensional cost vector: compute(op), alloc(cell), io(call), parallel(lane) | +| 12 | `effect C = A₁ | A₂ | ... 
| Aₙ` | Named alias definition expanding to a flat set of atomic effects | diff --git a/seps/SEP-0004-cost-analysis.md b/seps/SEP-0004-cost-analysis.md index 30aeacb..8d0755b 100644 --- a/seps/SEP-0004-cost-analysis.md +++ b/seps/SEP-0004-cost-analysis.md @@ -7,6 +7,7 @@ authors: - Zhan Rongrui created: 2026-03-31 requires: + - 1 - 2 - 3 discussion: "https://github.com/spore-lang/spore-evolution/discussions/4" @@ -22,15 +23,11 @@ superseded_by: null This SEP specifies Spore's compile-time cost analysis system — a three-tier mechanism that statically determines or verifies upper bounds on resource consumption for every function. The system operates along four cost dimensions — **compute(op)**, **alloc(cell)**, **io(call)**, **parallel(lane)** — and leverages the fact that Spore has **no loops** (all iteration is expressed via recursion and higher-order functions) to make cost analysis equivalent to recursion analysis. -> **Note:** This SEP is `Draft`. Compiler behavior follows the implementation -> repository (`spore/README.md`) until acceptance. Declare per-function budgets -> with four-slot `cost [compute, alloc, io, parallel]` (SEP-0001). - The three tiers are: 1. **Tier 1 — Automatic structural recursion detection** (~70% of functions): the compiler detects that one argument strictly decreases along a well-founded relation on every recursive call and automatically infers a cost bound. 2. **Tier 2 — Declarative verification** (~20%): the developer writes `cost [compute, alloc, io, parallel]` in the function signature; the compiler verifies each slot independently. -3. **Tier 3 — `@unbounded` escape hatch** (~10%): the developer explicitly opts out of cost checking; the annotation is *contagious* — callers inherit `@unbounded` unless they isolate it with `with_cost_limit`. +3. 
**Tier 3 — `@unbounded` escape hatch** (~10%): the developer explicitly opts out of cost checking; the annotation is _contagious_ — callers inherit `@unbounded` unless they isolate it with `with_cost_limit`. Cost expressions (`CostExpr`) are drawn from a restricted grammar over compile-time `Index` parameters — `+`, `*`, `log`, `max`, `min`, and `span(hi, lo)` — deliberately excluding arbitrary runtime values, division, ordinary subtraction, and conditionals. This restriction keeps verification decidable and makes cost a compile-time symbolic upper-bound function rather than runtime profiling. @@ -40,7 +37,7 @@ Cost expressions (`CostExpr`) are drawn from a restricted grammar over compile-t ### The problem with runtime profiling -Traditional performance analysis relies on runtime profiling: run the program, measure timings, hope the workload is representative. This is machine-dependent, non-reproducible, and fundamentally reactive — you discover performance regressions *after* they ship. +Traditional performance analysis relies on runtime profiling: run the program, measure timings, hope the workload is representative. This is machine-dependent, non-reproducible, and fundamentally reactive — you discover performance regressions _after_ they ship. 
### Why Spore can do better @@ -53,13 +50,13 @@ Spore's language design creates a unique opportunity for compile-time cost analy ### Design goals -| Goal | Description | -|------|-------------| +| Goal | Description | +| ------------- | --------------------------------------------------------------------------------------- | | High coverage | ~90% of real-world recursive code gets a cost bound automatically or semi-automatically | -| Zero burden | Simple cases require no manual annotation | -| Escapable | Unanalyzable code does not block compilation — it produces a warning | -| Composable | Recursive cost and higher-order function cost compose seamlessly | -| Decidable | The verification algorithm always terminates in polynomial time | +| Zero burden | Simple cases require no manual annotation | +| Escapable | Unanalyzable code does not block compilation — it produces a warning | +| Composable | Recursive cost and higher-order function cost compose seamlessly | +| Decidable | The verification algorithm always terminates in polynomial time | ### The core equation @@ -153,7 +150,7 @@ fn collatz_steps(n: I64) -> I64 { } ``` -`@unbounded` functions cannot be called directly from ordinary four-slot cost declarations without isolation. The intended bridge is a **runtime cost limiter** (below). The reference compiler in the `spore` implementation repository does not implement this form yet; see that repo’s `docs/DESIGN.md`. +`@unbounded` functions cannot be called directly from ordinary four-slot cost declarations without isolation. The bridge is a **runtime cost limiter** (below): it lexically bounds evaluation of unbounded callees while preserving a checkable four-slot contract for the enclosing function. Contagion and isolation rules are normative in this SEP; how implementations surface violations in the pipeline (`K0xxx` and related) belongs to SEP-0006. ```spore fn safe_collatz(n: I64) -> I64 ! 
CostExceeded @@ -218,16 +215,32 @@ Design principle: **What can be inferred automatically shall never require manua ## Reference-level explanation +### Notation + +| Symbol | Meaning | +| --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| `C` / `A` / `W` / `P` | Cost dimensions: Compute(op), Alloc(cell), IO(call), Parallel(lane). Note: `C` here denotes **Compute**, distinct from the EffectSet metavariable `C` used in SEP-0002's typing judgments. | +| `K(e)` | CostVector of expression `e` | +| `⊕` | Pointwise CostVector addition | +| `⊗` | CostVector scaling | +| `≤` | Pointwise CostVector comparison (each dimension ≤) | +| `N0` | Non-negative integers {0, 1, 2, …} | +| `N: Index` | Compile-time non-negative size parameter | +| `σ` | Assignment environment: IndexVar → N0 | +| `⟦e⟧σ` | Semantic evaluation of CostExpr `e` under assignment `σ` | +| `≺` | Well-founded decreasing relation | +| `≼` | Asymptotic dominance | + ### 4.1 Cost dimensions The abstract machine maintains four independent cost dimensions: -| Dimension | Abbreviation | Meaning | Unit | -|-----------|-------------|---------|------| -| Compute | `C` | CPU operation steps | op (operation) | -| Allocation | `A` | Heap memory allocation | cell (abstract memory unit) | -| I/O | `W` | Side-effect / external call count | call | -| Parallelism | `P` | Parallel execution width | lane | +| Dimension | Abbreviation | Meaning | Unit | +| ----------- | ------------ | --------------------------------- | --------------------------- | +| Compute | `C` | CPU operation steps | op (operation) | +| Allocation | `A` | Heap memory allocation | cell (abstract memory unit) | +| I/O | `W` | Side-effect / external call count | call | +| Parallelism | `P` | Parallel execution width | lane | **Scalar summaries (reports):** tooling may fold **C**, **A**, and **W** 
into one weighted number for display. Declarations remain the four-slot form; the fold is: @@ -244,33 +257,33 @@ where `α` and `β` are project-configurable weights (default α = 2, β = 100). #### Compute (C dimension) -| Operation | Cost (op) | Notes | -|-----------|----------|-------| -| Integer `+`, `-`, `*` | 1 | | -| Integer `/`, `%` | 2 | Division is slightly more expensive | -| F64 `+`, `-`, `*` | 2 | | -| F64 `/` | 3 | | -| Comparison `==`, `!=`, `<`, `>` | 1 | | -| Logical `&&`, `\|\|`, `!` | 1 | Max-path (short-circuit does not reduce cost) | -| Bitwise `&`, `\|`, `^`, `<<`, `>>` | 1 | | -| Variable read | 0 | Already in scope | -| `let` binding | 1 | | -| Pattern arm | 1 | Per arm matched | -| Function call overhead | 3 | Fixed, excludes callee body | -| Closure creation (N captures) | N + 2 | | -| Pipe `\|>` | 0 | Syntactic sugar | +| Operation | Cost (op) | Notes | +| ---------------------------------- | --------- | --------------------------------------------- | +| Integer `+`, `-`, `*` | 1 | | +| Integer `/`, `%` | 2 | Division is slightly more expensive | +| F64 `+`, `-`, `*` | 2 | | +| F64 `/` | 3 | | +| Comparison `==`, `!=`, `<`, `>` | 1 | | +| Logical `&&`, `\|\|`, `!` | 1 | Max-path (short-circuit does not reduce cost) | +| Bitwise `&`, `\|`, `^`, `<<`, `>>` | 1 | | +| Variable read | 0 | Already in scope | +| `let` binding | 1 | | +| Pattern arm | 1 | Per arm matched | +| Function call overhead | 3 | Fixed, excludes callee body | +| Closure creation (N captures) | N + 2 | | +| Pipe `\|>` | 0 | Syntactic sugar | #### Allocation (A dimension) -| Operation | Cost (cell) | -|-----------|------------| -| Struct creation | field count | -| List creation | element count + 1 header | -| `Str` creation | ⌈len / 8⌉ | +| Operation | Cost (cell) | +| ------------------- | -------------------------------------- | +| Struct creation | field count | +| List creation | element count + 1 header | +| `Str` creation | ⌈len / 8⌉ | | `Str` concatenation | ⌈(len_a + 
len_b) / 8⌉ (new allocation) | -| Enum / union | 1 (tag + max variant size) | -| Deep copy | original cell count | -| Borrow / reference | 0 | +| Enum / union | 1 (tag + max variant size) | +| Deep copy | original cell count | +| Borrow / reference | 0 | #### I/O (W dimension) @@ -278,14 +291,14 @@ Every system call (file read/write, network request, stdio, random number genera ### 4.3 Composition rules -| Form | Cost rule | -|------|-----------| -| Sequential `A; B` | `cost(A) + cost(B)` | -| Conditional `if c then A else B` | `cost(c) + max(cost(A), cost(B))` | -| Pattern match `match x { p₁ => A, p₂ => B, ... }` | `cost(x) + max(cost(A), cost(B), ...) + arms × 1` | -| Function call `f(args)` | `Σ cost(argᵢ) + 3 + cost(f.body)` | -| Pipe chain `x \|> f \|> g` | `cost(x) + cost(f) + cost(g)` | -| Parallel `parallel { A, B }` | C, A, W: `max(cost(A), cost(B)) + sync_overhead`; P: `sum(P(A), P(B))` | +| Form | Cost rule | +| ------------------------------------------------- | ---------------------------------------------------------------------- | +| Sequential `A; B` | `cost(A) + cost(B)` | +| Conditional `if c then A else B` | `cost(c) + max(cost(A), cost(B))` | +| Pattern match `match x { p₁ => A, p₂ => B, ... }` | `cost(x) + max(cost(A), cost(B), ...) + arms × 1` | +| Function call `f(args)` | `Σ cost(argᵢ) + 3 + cost(f.body)` | +| Pipe chain `x \|> f \|> g` | `cost(x) + cost(f) + cost(g)` | +| Parallel `parallel { A, B }` | C, A, W: `max(cost(A), cost(B)) + sync_overhead`; P: `sum(P(A), P(B))` | > **Concurrent sync overhead.** The `sync_overhead` is a configurable constant (default: 0) representing the synchronisation cost of joining parallel branches. It can be set in `spore.toml` as `[cost] sync_overhead = 10`. When set to 0 (the default), the parallel cost reduces to a simple `max`. Projects requiring precise modelling of fork/join overhead should configure this parameter. 
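The composition rules in section 4.3 can be exercised on concrete four-slot vectors. The sketch below is illustrative, not the reference checker: `Cost`, `seq`, `cond`, and `par` are hypothetical names, `+` is read as pointwise addition on all four slots (the `⊕` of the Notation table), and `sync_overhead` is added to each of the C, A, and W slots, which is a literal reading of the table row and may not match the intended semantics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cost:
    """Four-slot cost vector: compute(op), alloc(cell), io(call), parallel(lane)."""
    compute: int = 0
    alloc: int = 0
    io: int = 0
    parallel: int = 0

    def __add__(self, other):
        # Pointwise addition over all four slots (assumed reading of `+`).
        return Cost(
            self.compute + other.compute,
            self.alloc + other.alloc,
            self.io + other.io,
            self.parallel + other.parallel,
        )


def pointwise_max(a, b):
    return Cost(
        max(a.compute, b.compute),
        max(a.alloc, b.alloc),
        max(a.io, b.io),
        max(a.parallel, b.parallel),
    )


def seq(a, b):
    """Sequential `A; B`: cost(A) + cost(B)."""
    return a + b


def cond(c, a, b):
    """Conditional: cost(c) + max(cost(A), cost(B))."""
    return c + pointwise_max(a, b)


def par(a, b, sync_overhead=0):
    """Parallel: C, A, W take max plus sync_overhead; P is summed."""
    return Cost(
        max(a.compute, b.compute) + sync_overhead,
        max(a.alloc, b.alloc) + sync_overhead,
        max(a.io, b.io) + sync_overhead,
        a.parallel + b.parallel,
    )


a = Cost(compute=5, alloc=2, io=1, parallel=1)
b = Cost(compute=3, alloc=4, io=0, parallel=2)
print(seq(a, b))                    # Cost(compute=8, alloc=6, io=1, parallel=3)
print(par(a, b))                    # Cost(compute=5, alloc=4, io=1, parallel=3)
print(par(a, b, sync_overhead=10))  # Cost(compute=15, alloc=14, io=11, parallel=3)
```

With the default `sync_overhead = 0`, `par` reduces to a plain `max` on C, A, and W, matching the note on concurrent sync overhead.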
@@ -447,15 +460,15 @@ FnVar ::= [a-z][A-Za-z0-9_]* #### Explicitly forbidden constructs -| Construct | Reason | -|-----------|--------| -| Ordinary runtime values | Cost must be a compile-time symbolic upper-bound function over Index parameters | -| Division `/` | Avoids division-by-zero and rational expressions | -| Ordinary subtraction `-` | May produce negative values and non-monotone expressions | -| Conditionals `if...then...else` | Introduces undecidable branching — conditionals can encode arbitrary predicates | -| Recursive cost definitions | Avoids fixpoint computation; recursion analysis is handled at a separate layer | -| Negative numbers | Cost domain is `N0` (non-negative integers) | -| Variable exponents `n^m` | Pushes comparison into the exponential polynomial domain, losing polynomial decidability | +| Construct | Reason | +| ------------------------------- | ---------------------------------------------------------------------------------------- | +| Ordinary runtime values | Cost must be a compile-time symbolic upper-bound function over Index parameters | +| Division `/` | Avoids division-by-zero and rational expressions | +| Ordinary subtraction `-` | May produce negative values and non-monotone expressions | +| Conditionals `if...then...else` | Introduces undecidable branching — conditionals can encode arbitrary predicates | +| Recursive cost definitions | Avoids fixpoint computation; recursion analysis is handled at a separate layer | +| Negative numbers | Cost domain is `N0` (non-negative integers) | +| Variable exponents `n^m` | Pushes comparison into the exponential polynomial domain, losing polynomial decidability | `span(hi, lo)` is the only difference-like operation. Its meaning is `max(hi - lo, 0)`, and it exists only in the Index layer so APIs can express @@ -466,9 +479,9 @@ Residual budgeting does not weaken this rule: the checker may compute surface and therefore does not reintroduce ordinary subtraction into the source language. 
-#### Implementation (Rust) +#### CostExpr definition -The `CostExpr` type is implemented in `sporec-typeck/src/cost.rs`: +The `CostExpr` type has the following structure: ```rust pub enum CostExpr { @@ -509,7 +522,7 @@ Let σ: IndexVar → N0 be an assignment environment. The semantic function that does not contain `span`, if σ₁(N) ≤ σ₂(N) for all variables N, then ⟦e⟧σ₁ ≤ ⟦e⟧σ₂. -*Proof.* By structural induction on `e`. All operations (`+`, `×`, `log`, +_Proof._ By structural induction on `e`. All operations (`+`, `×`, `log`, `max`, `min`) are monotone non-decreasing on N0. ∎ For `span(hi, lo)`, the checker tracks variance: `hi` is covariant and `lo` is @@ -520,7 +533,7 @@ runtime-value reasoning. #### Tier 1: Structural recursion auto-detection -**Definition.** A function f(x₁, ..., xₙ) is *structurally recursive* if there exists i ∈ {1, ..., n} such that for every recursive call f(y₁, ..., yₙ): +**Definition.** A function f(x₁, ..., xₙ) is _structurally recursive_ if there exists i ∈ {1, ..., n} such that for every recursive call f(y₁, ..., yₙ): ```text yᵢ ≺ xᵢ (where ≺ is a well-founded relation on type Tᵢ) @@ -528,17 +541,17 @@ yᵢ ≺ xᵢ (where ≺ is a well-founded relation on type Tᵢ) The compiler recognizes the following decreasing patterns: -| Pattern | Source → Recursive arg | Well-founded relation | Typical cost | -|---------|----------------------|----------------------|-------------| -| Natural number decrement | `n → n - 1` (with `n > 0` guard) | `<` on ℕ | O(n) | -| List tail | `list → list.tail` | Sublist relation | O(n) | -| Tree child (unary) | `tree → tree.left` or `tree → tree.right` | Subtree relation | O(log n) balanced / O(n) worst | -| Tree child (binary) | `tree → tree.left` and `tree → tree.right` | Subtree relation | O(n) | -| Enum destructuring | `match x { Variant(inner) => f(inner) }` | Structural subterm | O(depth) | -| Tuple projection | `(a, b) → a` or `(a, b) → b` (strictly smaller) | Structural subterm | Depends on projected component | 
-| Integer halving | `n → n / 2` (with `n > 0` guard) | `<` on ℕ | O(log n) | +| Pattern | Source → Recursive arg | Well-founded relation | Typical cost | +| ------------------------ | ----------------------------------------------- | --------------------- | ------------------------------ | +| Natural number decrement | `n → n - 1` (with `n > 0` guard) | `<` on ℕ | O(n) | +| List tail | `list → list.tail` | Sublist relation | O(n) | +| Tree child (unary) | `tree → tree.left` or `tree → tree.right` | Subtree relation | O(log n) balanced / O(n) worst | +| Tree child (binary) | `tree → tree.left` and `tree → tree.right` | Subtree relation | O(n) | +| Enum destructuring | `match x { Variant(inner) => f(inner) }` | Structural subterm | O(depth) | +| Tuple projection | `(a, b) → a` or `(a, b) → b` (strictly smaller) | Structural subterm | Depends on projected component | +| Integer halving | `n → n / 2` (with `n > 0` guard) | `<` on ℕ | O(log n) | -**Detection algorithm** (implemented in `sporec-typeck/src/cost.rs`): +**Detection algorithm**: ```text algorithm detect_structural_recursion(f): @@ -622,12 +635,12 @@ fn collatz_steps(n: I64) -> I64 { **Rules:** -| Rule | Description | -|------|-------------| -| Warning, not error | `@unbounded` produces a compiler warning, does not block compilation | -| Contagious | Calling an `@unbounded` function makes the caller `@unbounded` too (unless wrapped in `with_cost_limit`) | -| Context restriction | `@unbounded` functions cannot be called directly inside ordinary `cost [...]` functions | -| Hole interaction | Holes inside `@unbounded` functions report `cost_budget: unbounded` | +| Rule | Description | +| ------------------- | -------------------------------------------------------------------------------------------------------- | +| Warning, not error | `@unbounded` produces a compiler warning, does not block compilation | +| Contagious | Calling an `@unbounded` function makes the caller `@unbounded` too (unless wrapped in 
`with_cost_limit`) | +| Context restriction | `@unbounded` functions cannot be called directly inside ordinary `cost [...]` functions | +| Hole interaction | Holes inside `@unbounded` functions report `cost_budget: unbounded` | **Impact scope tracking.** When a function is marked `@unbounded`, its unbounded status propagates through the call chain: @@ -648,16 +661,16 @@ WARNING [unbounded-function] collatz_steps is marked @unbounded. ### 4.7 Higher-order function cost formulas -Since Spore has no loops, higher-order functions are the *only* iteration mechanism besides recursion. Verified standard-library cost formulas are defined for indexed containers: +Since Spore has no loops, higher-order functions are the _only_ iteration mechanism besides recursion. Verified standard-library cost formulas are defined for indexed containers: -| Function | Cost formula | -|----------|-------------| -| `v.map(f)` where `v: Vec[T, max: N]` | `N × cost(f) + N` | -| `v.fold(init, f)` where `v: Vec[T, max: N]` | `N × cost(f)` | -| `v.filter(pred)` where `v: Vec[T, max: N]` | `N × cost(pred) + N` | -| `a.zip(b)` where `a: Vec[A, max: M]`, `b: Vec[B, max: N]` | `min(M, N)` | -| `v.take(count)` where `count: Count[K]` | `min(N, K)` | -| `v.reduce(f)` where `v: Vec[T, max: N]` | `N × cost(f)` | +| Function | Cost formula | +| --------------------------------------------------------- | -------------------- | +| `v.map(f)` where `v: Vec[T, max: N]` | `N × cost(f) + N` | +| `v.fold(init, f)` where `v: Vec[T, max: N]` | `N × cost(f)` | +| `v.filter(pred)` where `v: Vec[T, max: N]` | `N × cost(pred) + N` | +| `a.zip(b)` where `a: Vec[A, max: M]`, `b: Vec[B, max: N]` | `min(M, N)` | +| `v.take(count)` where `count: Count[K]` | `min(N, K)` | +| `v.reduce(f)` where `v: Vec[T, max: N]` | `N × cost(f)` | Since higher-order function arguments `f` in Spore must be pure (no effect variables), `cost(f)` is always statically determinable. 
Substituting `cost(f)` into the formula yields a valid CostExpr. @@ -721,11 +734,11 @@ fn is_odd(n: I64) -> Bool { Analysis: SCC = {is_even, is_odd}. Combined call pattern: `is_even(n) → is_odd(n-1) → is_even(n-2) → ...`. Parameter n decreases by 2 every two calls → structural recursion, cost = O(n). -| Scenario | Handling | -|----------|----------| -| SCC satisfies structural recursion | Automatic cost derivation | +| Scenario | Handling | +| ----------------------------------------- | --------------------------------------------------------------- | +| SCC satisfies structural recursion | Automatic cost derivation | | SCC does not satisfy structural recursion | All functions in SCC need explicit `cost [...]` or `@unbounded` | -| Any function in SCC is `@unbounded` | Entire SCC treated as `@unbounded` | +| Any function in SCC is `@unbounded` | Entire SCC treated as `@unbounded` | ### 4.9 Decidability proof sketch @@ -745,14 +758,14 @@ With nesting depth d bounded by a constant (default ≤ 8), this produces at mos Verification rules for max/min at the top level: -| Form | Rule | Condition | -|------|------|-----------| -| `max(A, B) ≤ C` | Verify A ≤ C **and** B ≤ C | Necessary and sufficient | -| `min(A, B) ≤ C` | Verify A ≤ C **or** B ≤ C | Sufficient (conservative) | -| `C ≤ max(A, B)` | Verify C ≤ A **or** C ≤ B | Sufficient (conservative) | -| `C ≤ min(A, B)` | Verify C ≤ A **and** C ≤ B | Necessary and sufficient | +| Form | Rule | Condition | +| --------------- | -------------------------- | ------------------------- | +| `max(A, B) ≤ C` | Verify A ≤ C **and** B ≤ C | Necessary and sufficient | +| `min(A, B) ≤ C` | Verify A ≤ C **or** B ≤ C | Sufficient (conservative) | +| `C ≤ max(A, B)` | Verify C ≤ A **or** C ≤ B | Sufficient (conservative) | +| `C ≤ min(A, B)` | Verify C ≤ A **and** C ≤ B | Necessary and sufficient | -**Step 2: Normal form conversion.** A *poly-log monomial* has the form: +**Step 2: Normal form conversion.** A _poly-log monomial_ has the 
form: ```text t = c × n₁^a₁ × ... × nₖ^aₖ × log(n₁)^b₁ × ... × log(nₖ)^bₖ @@ -761,7 +774,7 @@ t = c × n₁^a₁ × ... × nₖ^aₖ × log(n₁)^b₁ × ... × log(nₖ)^b where c is a positive N0 coefficient, and each ai and bi is in N0. We write this as the triple **(c, a_bar, b_bar)**. -The *normal form* of a CostExpr is a finite sum of poly-log monomials. Conversion algorithm NF: +The _normal form_ of a CostExpr is a finite sum of poly-log monomials. Conversion algorithm NF: ```text NF(c) = {(c, 0̄, 0̄)} @@ -786,7 +799,7 @@ After conversion, merge like terms: monomials with the same (ā, b̄) have their **Step 3: Asymptotic dominance check.** -**Definition 4.2 (Asymptotic dominance).** Monomial t₁ = (c₁, ā₁, b̄₁) is *dominated* by t₂ = (c₂, ā₂, b̄₂), written t₁ ≼ t₂, iff: +**Definition 4.2 (Asymptotic dominance).** Monomial t₁ = (c₁, ā₁, b̄₁) is _dominated_ by t₂ = (c₂, ā₂, b̄₂), written t₁ ≼ t₂, iff: ```text t₁ ≼ t₂ ⟺ ā₁ < ā₂ (componentwise ≤ and ≠) @@ -817,7 +830,7 @@ space is T^k, feasible for k <= 5. **Theorem 4.2 (Polynomial-time decidability).** The asymptotic comparison of CostExprs is decidable in O(n⁵) time, where n = |C| + |B|. -*Proof.* Let n = |C| + |B| (total AST nodes) and k = |V| (variable count). +_Proof._ Let n = |C| + |B| (total AST nodes) and k = |V| (variable count). 1. **max/min lifting**: O(n) with d bounded by a constant. 2. **Normal form conversion**: Multiplication distributes monomials (Cartesian product). An expression of size n yields at most O(n²) monomials. Logarithm simplification: O(n) substitutions. Like-term merging: sort in O(n² log n). @@ -834,7 +847,7 @@ This is in **P** (polynomial-time complexity class). ∎ threshold T in N0 such that for all non-negative Index assignments n_bar with min(n_bar) >= T, we have C(n_bar) <= B(n_bar). -*Proof sketch.* +_Proof sketch._ 1. Normal form conversion preserves asymptotic equivalence (logarithm simplifications introduce only constant-factor errors). 2. 
If every monomial in NF(C) is dominated by some monomial in NF(B): @@ -947,12 +960,12 @@ extern fn openssl_encrypt[N: Index](data: Bytes[N], key: Key) -> Bytes[N] ! Cryp **Rules for extern fn cost:** -| Rule | Description | -|------|-------------| -| Required declaration | `extern fn` without a `cost` clause is treated as `@unbounded` | -| No body analysis | The compiler trusts the declared cost — no verification is possible | -| Contagious unbounded | An `@unbounded` extern fn follows the same contagion rules as any `@unbounded` function | -| Variable binding | Cost variables are Index parameters such as `N`; ordinary runtime values do not bind into CostExpr | +| Rule | Description | +| -------------------- | -------------------------------------------------------------------------------------------------- | +| Required declaration | `extern fn` without a `cost` clause is treated as `@unbounded` | +| No body analysis | The compiler trusts the declared cost — no verification is possible | +| Contagious unbounded | An `@unbounded` extern fn follows the same contagion rules as any `@unbounded` function | +| Variable binding | Cost variables are Index parameters such as `N`; ordinary runtime values do not bind into CostExpr | This ensures FFI boundaries maintain cost transparency — external code cannot silently introduce cost black holes. 
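The contagion rule that applies both to `@unbounded` functions and to `extern fn` declarations without a `cost` clause can be pictured as a reachability walk over the call graph. The following is a minimal Python sketch under stated assumptions, not the compiler's implementation: the dictionary encoding of the call graph, the `isolated_calls` set standing in for call sites wrapped in `with_cost_limit`, and the function names other than `collatz_steps` are all hypothetical.

```python
from collections import deque


def propagate_unbounded(calls, unbounded_roots, isolated_calls=frozenset()):
    """Return every function that ends up unbounded.

    calls: dict mapping caller -> iterable of callees.
    unbounded_roots: functions marked @unbounded, or extern fns with no cost clause.
    isolated_calls: (caller, callee) pairs assumed to be wrapped in
        with_cost_limit, which stops contagion along that edge.
    """
    # Invert the call graph so the walk goes from callee to callers,
    # skipping edges that are isolated by a cost limit.
    callers = {}
    for caller, callees in calls.items():
        for callee in callees:
            if (caller, callee) not in isolated_calls:
                callers.setdefault(callee, set()).add(caller)

    unbounded = set(unbounded_roots)
    queue = deque(unbounded_roots)
    while queue:
        fn = queue.popleft()
        for caller in callers.get(fn, ()):
            if caller not in unbounded:
                unbounded.add(caller)
                queue.append(caller)
    return unbounded


# Hypothetical call graph: safe_wrapper isolates its call, report does not.
calls = {
    "main": ["report", "safe_wrapper"],
    "report": ["collatz_steps"],
    "safe_wrapper": ["collatz_steps"],
}
result = propagate_unbounded(
    calls,
    unbounded_roots=["collatz_steps"],
    isolated_calls={("safe_wrapper", "collatz_steps")},
)
print(sorted(result))
```

In this sketch `report` and, transitively, `main` become unbounded, while `safe_wrapper` stays bounded because its only call to `collatz_steps` is treated as isolated.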
@@ -1086,7 +1099,7 @@ CostExpr is serialized as a JSON AST for tool consumption: { "type": "Mul", "left": { "type": "Var", "name": "n" }, - "right": { "type": "Log", "arg": { "type": "IndexVar", "name": "N" } } + "right": { "type": "Log", "arg": { "type": "IndexVar", "name": "N" } } } ``` @@ -1104,14 +1117,14 @@ The Language Server Protocol exposes: ### New diagnostic categories -| Code | Severity | Trigger | -|------|----------|---------| -| `cost-exceeded` | Error | Inferred cost exceeds declared bound | -| `unbounded-cost` | Warning | Recursive function with no detectable cost bound and no `@unbounded` annotation | -| `unbounded-function` | Warning | Function marked `@unbounded` | -| `unbounded-in-bounded-context` | Error | `@unbounded` function called from `cost [...]` context without `with_cost_limit` | -| `unverified-cost-bound` | Warning | `cost [...]` declared but compiler cannot verify one or more slots | -| `cost-effect-conflict` | Error | `uses []` (pure) declared but W > 0 inferred | +| Code | Severity | Trigger | +| ------------------------------ | -------- | -------------------------------------------------------------------------------- | +| `cost-exceeded` | Error | Inferred cost exceeds declared bound | +| `unbounded-cost` | Warning | Recursive function with no detectable cost bound and no `@unbounded` annotation | +| `unbounded-function` | Warning | Function marked `@unbounded` | +| `unbounded-in-bounded-context` | Error | `@unbounded` function called from `cost [...]` context without `with_cost_limit` | +| `unverified-cost-bound` | Warning | `cost [...]` declared but compiler cannot verify one or more slots | +| `cost-effect-conflict` | Error | `uses []` (pure) declared but W > 0 inferred | ### Example diagnostics @@ -1137,7 +1150,7 @@ WARNING [unbounded-cost] fibonacci's cost cannot be statically determined. 
Inferred complexity: O(2^n) Options: - (a) Add tighter refinements (`n: I64 if n ≤ 30`) together with concrete `cost [...]` literals for small‑n paths + (a) Add a tighter refinement (`type SmallN = I64 when self <= 30`) together with concrete `cost [...]` literals for small-n paths (b) Mark as `@unbounded` (relinquish cost constraint) (c) Rewrite using structural recursion or tail recursion + iteration bound ``` @@ -1164,7 +1177,7 @@ The cost system and effect system form a cross-validation network. If a function 4. **`@unbounded` contagion.** One `@unbounded` function deep in the call chain can force many callers to also become `@unbounded`, potentially undermining the cost system's value. Mitigation: `with_cost_limit` isolation. -5. **No runtime validation (yet).** The system currently has no mechanism to verify that compile-time cost predictions match actual runtime performance. Cost drift detection is deferred to future work. +5. **No built-in runtime proof of static predictions.** Compile-time costs bound the abstract accounting model (`compute`, `alloc`, `io`, `parallel`); they are not automatically certified against wall-clock or host-resource measurements. Closing that gap belongs to tooling and profiling outside this SEP's normative scope. 6. **Multi-variable partial order.** With multiple variables, some monomials are incomparable (e.g., `n*m` vs `n²`), leading to conservative FAIL. Developers must restructure declarations. 
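The incomparability mentioned in drawback 6 can be illustrated with a small Python sketch of a componentwise comparison on polynomial exponent vectors. This is a simplification for illustration only: it ignores coefficients and logarithm exponents, the helper names are hypothetical, and the variable ordering `(n, m)` is an assumption.

```python
def componentwise_le(a, b):
    """True when every exponent in a is <= the matching exponent in b."""
    return len(a) == len(b) and all(x <= y for x, y in zip(a, b))


def compare_exponents(a, b):
    """Classify two exponent vectors under the componentwise partial order."""
    if a == b:
        return "equal"
    if componentwise_le(a, b):
        return "dominated"      # a grows no faster than b in every variable
    if componentwise_le(b, a):
        return "dominates"
    return "incomparable"       # neither vector bounds the other


# Exponent vectors over the variables (n, m).
n_times_m = (1, 1)    # n * m
n_squared = (2, 0)    # n^2
n_squared_m = (2, 1)  # n^2 * m

print(compare_exponents(n_times_m, n_squared))
print(compare_exponents(n_times_m, n_squared_m))
print(compare_exponents(n_squared, n_squared_m))
```

Under this ordering `n*m` and `n^2` are incomparable, whereas a declared bound of `n^2 * m` dominates both, which is one way a declaration might be restructured to avoid a conservative FAIL.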
@@ -1320,14 +1333,14 @@ With k > 1 variables, the comparison uses componentwise partial ordering on expo ### Limitations -| Limitation | Description | Workaround | -|---|---|---| -| No conditional cost | Cannot express "if sorted then O(n), else O(n²)" | Use `max(n, n^2) = n^2` (conservative) | -| No amortized analysis | Cannot express "amortized O(1)" | Use worst-case cost | -| No probabilistic analysis | Cannot express "expected O(n log n)" | Use worst-case cost | -| No subtraction / division | Cannot exactly express `n*(n-1)/2` | Use `n^2` upper bound | -| max/min nesting depth bounded | Deep nesting causes expression blowup | Compiler limits depth (default 8) | -| Multi-variable incomparability | Partial order on exponent vectors can be inconclusive | Restructure declaration or use `max` | +| Limitation | Description | Workaround | +| ------------------------------ | ----------------------------------------------------- | -------------------------------------- | +| No conditional cost | Cannot express "if sorted then O(n), else O(n²)" | Use `max(n, n^2) = n^2` (conservative) | +| No amortized analysis | Cannot express "amortized O(1)" | Use worst-case cost | +| No probabilistic analysis | Cannot express "expected O(n log n)" | Use worst-case cost | +| No subtraction / division | Cannot exactly express `n*(n-1)/2` | Use `n^2` upper bound | +| max/min nesting depth bounded | Deep nesting causes expression blowup | Compiler limits depth (default 8) | +| Multi-variable incomparability | Partial order on exponent vectors can be inconclusive | Restructure declaration or use `max` | --- @@ -1363,36 +1376,43 @@ WARNING [cost-drift] function merge_sort: actual cost exceeds prediction. 
## Design decisions -| Decision | Choice | Rationale | -|---|---|---| -| Recursion analysis tiers | 3 tiers (auto + declarative + escape) | Balances automation with expressiveness; ~90% coverage | -| Structural recursion detection | Syntactic parameter-decreasing check | Simple, reliable, O(\|call_graph\|) complexity | -| Verification failure handling | Warning, not error | Gradual adoption; does not block development | -| `@unbounded` semantics | Contagious + isolatable via `with_cost_limit` | Ensures cost information propagates while providing an escape path | -| `decreases` clause | Optional | Compiler auto-derives in most cases; manual only when needed | -| Mutual recursion | SCC-based whole-group analysis | Natural fit with call-graph analysis | -| Higher-order function cost | Compiler built-in formulas | No loops → HOFs are the only iteration mechanism; must be built-in | +| Decision | Choice | Rationale | +| ------------------------------ | --------------------------------------------- | ------------------------------------------------------------------ | +| Recursion analysis tiers | 3 tiers (auto + declarative + escape) | Balances automation with expressiveness; ~90% coverage | +| Structural recursion detection | Syntactic parameter-decreasing check | Simple, reliable, O(\|call_graph\|) complexity | +| Verification failure handling | Warning, not error | Gradual adoption; does not block development | +| `@unbounded` semantics | Contagious + isolatable via `with_cost_limit` | Ensures cost information propagates while providing an escape path | +| `decreases` clause | Optional | Compiler auto-derives in most cases; manual only when needed | +| Mutual recursion | SCC-based whole-group analysis | Natural fit with call-graph analysis | +| Higher-order function cost | Compiler built-in formulas | No loops → HOFs are the only iteration mechanism; must be built-in | --- ## Unresolved questions -1. 
**Tail-call optimization and cost.** TCO changes stack space consumption but not computation cost. Should the cost model distinguish a "stack depth" dimension? Current decision: no — TCO is a codegen optimization that does not affect cost analysis. +1. **Memoization and cost.** Should the compiler auto-detect memoizable recursion and adjust cost (e.g., `fibonacci` from O(2ⁿ) to O(n))? Current leaning: no auto-memoization, but the compiler suggests it in warnings. + +2. **Cost drift detection.** The mechanism, tolerance threshold, and CI integration are specified in the "Cost drift detection" section above. The remaining unresolved question is the design of a full runtime cost sampling framework — specifically, how to keep instrumentation overhead below 1% in production builds. -2. **Memoization and cost.** Should the compiler auto-detect memoizable recursion and adjust cost (e.g., `fibonacci` from O(2ⁿ) to O(n))? Current leaning: no auto-memoization, but the compiler suggests it in warnings. +3. **Probabilistic cost bounds.** Randomized algorithms (e.g., QuickSort with random pivot) have expected rather than worst-case cost. Should Spore support an **`expected`** cost metadata channel alongside worst-case four-slot bounds? Deferred to future work. -3. **Cost drift detection.** The mechanism, tolerance threshold, and CI integration are specified in the "Cost drift detection" section above. The remaining unresolved question is the design of a full runtime cost sampling framework — specifically, how to keep instrumentation overhead below 1% in production builds. +4. **Recursion depth limits.** Should the compiler enforce a compile-time recursion depth ceiling? Current decision: only `@unbounded` functions use runtime `with_cost_limit`. Whether to add a compile-time depth annotation (e.g., `max_depth ≤ 1000`) is unresolved. -4. **Probabilistic cost bounds.** Randomized algorithms (e.g., QuickSort with random pivot) have expected rather than worst-case cost. 
Should Spore support an **`expected`** cost metadata channel alongside worst-case four-slot bounds? Deferred to future work. +5. **Interaction with concurrency.** The parallel dimension `P` (lane) is defined, but runtime budget behavior across `parallel_scope` / `spawn` boundaries needs further specification. In particular, how does `with_cost_limit` behave across child tasks? -5. **Polymorphic cost (partially resolved).** The call-site instantiation approach is now specified in §4.17: `cost(f)` is substituted at each call site where `f` is concrete. Signature-level patterns such as `cost [N * cost(f) + N, N, 0, 0]` are first-class CostExprs when `N: Index`. +6. **Amortized analysis.** Operations like dynamic array append are O(1) amortized but O(n) worst-case. The current system can only express worst-case bounds. Whether to extend CostExpr with amortized semantics (potentially through a separate `amortized [...]` metadata channel parallel to worst-case `cost [...]`) is deferred. -6. **Recursion depth limits.** Should the compiler enforce a compile-time recursion depth ceiling? Current decision: only `@unbounded` functions use runtime `with_cost_limit`. Whether to add a compile-time depth annotation (e.g., `max_depth ≤ 1000`) is unresolved. +7. **Standard library cost annotations.** The standard library must be annotated with cost bounds for the system to be useful. What is the process for auditing and annotating existing library functions? Should the compiler ship with a built-in cost database for the standard library? -7. **Interaction with concurrency.** The parallel dimension `P` (lane) is defined but its interaction with async/await and structured concurrency needs further specification. In particular, how does `with_cost_limit` behave across spawn boundaries? +8. **max/min nesting depth limit.** The current default is 8 levels. Is this sufficient for all practical use cases? 
Should the limit be configurable, and what is the impact on compilation time when it is raised? -8. **Amortized analysis.** Operations like dynamic array append are O(1) amortized but O(n) worst-case. The current system can only express worst-case bounds. Whether to extend CostExpr with amortized semantics (potentially through a separate `amortized [...]` metadata channel parallel to worst-case `cost [...]`) is deferred. +### Resolved questions -9. **Standard library cost annotations.** The standard library must be annotated with cost bounds for the system to be useful. What is the process for auditing and annotating existing library functions? Should the compiler ship with a built-in cost database for the standard library? +1. **Tail-call optimization and cost.** TCO changes stack space consumption but + not the four declared cost dimensions. The cost model does not add a separate + stack-depth dimension; TCO remains a codegen/runtime optimization. -10. **max/min nesting depth limit.** The current default is 8 levels. Is this sufficient for all practical use cases? Should the limit be configurable, and what is the impact on compilation time when it is raised? +2. **Polymorphic cost.** The call-site instantiation approach is specified in + §4.17: `cost(f)` is substituted at each call site where `f` is concrete. + Signature-level patterns such as `cost [N * cost(f) + N, N, 0, 0]` are + first-class CostExprs when `N: Index`. diff --git a/seps/SEP-0005-hole-system.md b/seps/SEP-0005-hole-system.md index 29a5a2d..2a3dc1d 100644 --- a/seps/SEP-0005-hole-system.md +++ b/seps/SEP-0005-hole-system.md @@ -7,6 +7,7 @@ authors: - Zhan Rongrui created: 2026-03-31 requires: + - 1 - 2 - 3 - 4 @@ -17,20 +18,20 @@ superseded_by: null # SEP-0005: Hole System & Agent Protocol -> **Executive Summary**: Defines typed holes (`?name`) as first-class language constructs that carry type, effect, and cost context. 
The current stable machine surface is a shared typed hole protocol: `sporec holes FILE --json` emits a root object with `holes` and `dependency_graph`, and `sporec query-hole FILE ?name --json` returns the same per-hole object directly. This SEP keeps HoleReport on the current `v0.x` lineage, documents additive target extensions such as richer effect context, residual context, and rejection reasons, and preserves the richer long-term agent/watch protocol — dependency-aware fill ordering, cross-hole coordination, and the `DISCOVER → ANALYZE → PROPOSE → VERIFY → ACCEPT/REJECT` workflow — while documenting that today's `spore watch --json` still emits `compile_result` plus a summary-style `hole_graph_update`, not the full target graph payload. +> **Executive Summary**: Defines typed holes (`?name`) as first-class language constructs that carry type, effect, and cost context. The machine surface is a shared typed hole protocol: `sporec holes FILE --json` emits a root object with `holes` and `dependency_graph`, and `sporec query-hole FILE ?name --json` returns the same per-hole object directly. This SEP defines the HoleReport schema, the Hole Dependency Graph with layered topological sort and parallel fill scheduling, and the Agent state machine protocol (DISCOVER → ANALYZE → PROPOSE → VERIFY → ACCEPT/REJECT) for autonomous hole filling workflows. ## Summary This SEP specifies Spore's **Hole System** — a first-class language mechanism that treats unfinished code as a structured, typed, compiler-mediated collaboration interface between humans and AI Agents. -A *hole* is written `?name` (optionally `?name: Type`) in any expression position within a function body. The compiler accepts programs containing holes, classifies those functions as **partial**, and produces a shared typed hole report. The stable machine protocol reuses one hole object across both batch and single-hole queries: `sporec holes FILE --json` emits `{ "holes": [...], "dependency_graph": ... 
}`, while `sporec query-hole FILE ?name --json` returns the matching hole object directly. That shared object now carries the fields agents actually consume today, including `name`, `display_name`, `location`, `expected_type`, `type_inferred_from`, `function`, `enclosing_signature`, `bindings`, `binding_dependencies`, `available_effects`, `errors_to_handle`, a legacy `cost_budget`, `candidates`, `dependent_holes`, `confidence`, and `error_clusters`. That `cost_budget` field name remains on the current stable surface, but today it is still a scalar-style compatibility snapshot rather than a fully checked 4D residual context. +A _hole_ is written `?name` (optionally `?name: Type`) in any expression position within a function body. The compiler accepts programs containing holes, classifies those functions as **partial**, and produces a shared typed hole report. The machine protocol uses one hole object across both batch and single-hole queries: `sporec holes FILE --json` emits `{ "holes": [...], "dependency_graph": ... }`, while `sporec query-hole FILE ?name --json` returns the matching hole object directly. Each hole object carries: `name`, `display_name`, `location`, `expected_type`, `type_inferred_from`, `function`, `enclosing_signature`, `bindings`, `binding_dependencies`, `available_effects`, `errors_to_handle`, `cost_budget`, `candidates`, `dependent_holes`, `confidence`, and `error_clusters`. The `cost_budget` field holds a scalar-style summary; a future extension may add checked 4D residual context. -Multiple holes still form a **Hole Dependency Graph** (DAG), enabling topological ordering and parallel filling by multiple Agents. The long-term **Agent Protocol** defines a five-state machine (DISCOVER → ANALYZE → PROPOSE → VERIFY → ACCEPT/REJECT) for autonomous filling workflows. 
Today, however, `spore watch --json` remains a thinner transport: it emits per-cycle `compile_result` plus a summary `hole_graph_update`, while richer per-hole watch events remain the target architecture rather than the stable contract. +Multiple holes form a **Hole Dependency Graph** (DAG), enabling topological ordering and parallel filling by multiple Agents. The **Agent Protocol** defines a five-state machine (DISCOVER → ANALYZE → PROPOSE → VERIFY → ACCEPT/REJECT) for autonomous filling workflows. Key components formalized in this SEP: - **Hole syntax and semantics** (`?name`, `?name: Type`, partial functions) -- **HoleReport v0.x lineage**, with v0.3 target extensions for: (A) candidate scoring vector, (B) binding dependency graph, (C) confidence & ambiguity, (D) error clusters +- **HoleReport schema**, with extensions for: (A) candidate scoring vector, (B) binding dependency graph, (C) confidence & ambiguity, (D) error clusters - **Hole Dependency Graph** with layered topological sort and parallel fill scheduling - **Agent state machine protocol** for autonomous hole filling - **JSON output protocol** via `--json` flag and NDJSON event stream @@ -53,16 +54,16 @@ This creates two problems: Making holes a first-class language construct enables: -- **Compiler-mediated collaboration**: The compiler produces HoleReports that are *information-self-sufficient* — an Agent reading a report needs zero additional context to attempt a fill. +- **Compiler-mediated collaboration**: The compiler produces HoleReports that are _information-self-sufficient_ — an Agent reading a report needs zero additional context to attempt a fill. - **Incremental development**: Functions transition smoothly from `partial` to `complete`. Downstream callers are not invalidated because holes are body-only — they never affect signature hashes. - **Dependency-ordered filling**: The compiler can analyze data-flow between holes, build a DAG, and recommend an optimal filling order. 
-- **Cost-bounded filling (target behavior)**: each hole should eventually carry the checked residual context inherited from the enclosing function's declared budget; today the stable payload still exposes only a legacy `cost_budget` snapshot, and the compiler does not yet compute authoritative per-hole 4D residuals for candidate scoring. +- **Cost-bounded filling**: each hole carries a `cost_budget` inherited from the enclosing function's declared budget. A future extension may add checked 4D residual context. ### Why the Agent Protocol Matters -AI Agents are not humans reading error messages. They are stateless processes that parse structured output. Spore's hole system is designed with Agents as a *primary* consumer: +AI Agents are not humans reading error messages. They are stateless processes that parse structured output. Spore's hole system is designed with Agents as a _primary_ consumer: -- **HoleReport v0.x** keeps evolving by additive fields rather than by a major naming reset; the current target v0.3 additions replace human-readable strings (e.g., `match_quality: "partial"`) with machine-comparable scoring vectors. +- **HoleReport** keeps evolving by additive fields rather than by a detached schema-name reset; the current target additions replace human-readable strings (e.g., `match_quality: "partial"`) with machine-comparable scoring vectors. - **Binding dependency graphs** let Agents understand data-flow without re-analyzing source code. - **Confidence indicators** tell Agents when to auto-fill vs. when to request human guidance. - **NDJSON event streams** allow Agents to consume compiler output in real-time, reacting to each incremental compilation result. @@ -134,12 +135,12 @@ The compiler infers that `?nav_items` must have the type `render_nav` expects as A function containing at least one hole is **partial**. 
Partial functions: -| Property | Complete | Partial | -|---|---|---| -| Can be compiled | ✓ | ✓ | -| Can be called at runtime | ✓ | ✗ | -| Can be simulated | ✓ | ✓ | -| Appears in module exports | ✓ | ✓ (marked `partial`) | +| Property | Complete | Partial | +| ------------------------------ | -------------------- | ---------------------- | +| Can be compiled | ✓ | ✓ | +| Can be called at runtime | ✓ | ✗ | +| Can be simulated | ✓ | ✓ | +| Appears in module exports | ✓ | ✓ (marked `partial`) | | Signature hash changes on fill | if signature changes | never (holes are body) | Partiality is transitive: a caller of a partial function is itself partial. @@ -227,7 +228,7 @@ Empty function bodies desugar to a single hole named `{function_name}_body`. ### HoleInfo Structure -The implementation (in `sporec-typeck`) represents a single hole as: +The compiler-internal representation of a single hole uses the following structure: ```rust pub struct HoleInfo { @@ -239,42 +240,35 @@ pub struct HoleInfo { } ``` -**HoleInfo vs HoleReport:** `HoleInfo` is the **compiler-internal** representation (Rust struct in `sporec-typeck`), while `HoleReport` is the **JSON output format** for machine/Agent consumption. `HoleInfo` is converted to `HoleReport` via `to_json()` with additional computed fields (scores, confidence, error clusters). They serve different purposes: HoleInfo for compiler passes, HoleReport for external tooling. +**HoleInfo vs HoleReport:** `HoleInfo` is the compiler-internal representation, while `HoleReport` is the JSON output format for machine/Agent consumption. `HoleInfo` is converted to `HoleReport` with additional computed fields (scores, confidence, error clusters). They serve different purposes: HoleInfo for compiler passes, HoleReport for external tooling. 
The batch `sporec holes FILE --json` response aggregates all holes in a module by returning a root object with `holes` and `dependency_graph`; each element of `holes` is the same per-hole `HoleReport` object returned directly by -`sporec query-hole FILE ?name --json`. Hand-rolled JSON serialization is used -in v0.1 to minimize dependencies; serde migration is tracked as future work -(see Unresolved Questions §10). - -### HoleReport v0.3 (within the v0.x lineage) - -HoleReport v0.3 is a **superset** of v0.2 on the same `v0.x` line. The -project should not rename this family to a detached major-version scheme just -because additive fields land. Current implementation payloads are effectively -unversioned shared objects; if/when an explicit schema tag is emitted, it -should remain on a `spore/hole-report/v0.x` identifier. All v0.2 fields are -preserved; four new extensions are added. - -#### Base Fields (v0.2) - -| Field | Description | -|---|---| -| `hole.name` | Developer-assigned name | -| `hole.location` | Source location (file, line, column) | -| `hole.dependencies` | Names of upstream holes whose output feeds into this hole's context | -| `type.expected` | The type this hole must produce, including error variants | -| `type.inferred_from` | Human-readable explanation of why this type is expected | -| `bindings` | Variables in scope with name, type, and simulated value (`symbolic` or `computed`) | -| `available_effects` | The `uses` list available at the hole site | -| `errors_to_handle` | Error types not yet handled before the hole | -| `cost_budget.budget_total` | Legacy scalar-style compatibility field carried by today's stable payload | -| `cost_budget.cost_before_hole` | Legacy prefix-cost snapshot; useful as schema context, not yet an authoritative checked residual | -| `cost_budget.budget_remaining` | Legacy derived remainder; current implementations should not treat it as a precise 4D residual proof | -| `candidates` | Functions in scope whose return type 
matches the hole's expected type | -| `dependent_holes` | Holes that become reachable when this hole is filled | -| `enclosing_function` | Full signature context of the containing function | +`sporec query-hole FILE ?name --json`. + +### HoleReport schema + +HoleReport evolves by additive fields. The base fields are preserved; four extension groups are documented below. + +#### Base Fields + +| Field | Description | +| ------------------------------ | -------------------------------------------------------------------------------------------- | +| `hole.name` | Developer-assigned name | +| `hole.location` | Source location (file, line, column) | +| `hole.dependencies` | Names of upstream holes whose output feeds into this hole's context | +| `type.expected` | The type this hole must produce, including error variants | +| `type.inferred_from` | Human-readable explanation of why this type is expected | +| `bindings` | Variables in scope with name, type, and simulated value (`symbolic` or `computed`) | +| `available_effects` | The `uses` list available at the hole site | +| `errors_to_handle` | Error types not yet handled before the hole | +| `cost_budget.budget_total` | Scalar-style summary of the total declared budget | +| `cost_budget.cost_before_hole` | Cost accumulated before this hole (prefix snapshot) | +| `cost_budget.budget_remaining` | Estimated remaining budget (derived scalar; a future extension may add checked 4D residuals) | +| `candidates` | Functions in scope whose return type matches the hole's expected type | +| `dependent_holes` | Holes that become reachable when this hole is filled | +| `enclosing_function` | Full signature context of the containing function | #### Hole Type Inference Rule @@ -288,7 +282,7 @@ A hole's type is determined by the **intersection of all constraints** imposed b When multiple constraints agree, the intersection is the agreed-upon type. 
When constraints conflict, the compiler applies the **nearest constraint rule** (see Edge Cases §8.3) and emits a warning. -When context provides **no constraints**, the hole is *unconstrained* — reported with type `_`. The HoleReport lists available bindings so the Agent can propose a type. +When context provides **no constraints**, the hole is _unconstrained_ — reported with type `_`. The HoleReport lists available bindings so the Agent can propose a type. #### Hole as Match Scrutinee @@ -310,12 +304,12 @@ The compiler infers: `?parsed_data` must have a type with at least variants `Val Replaces the coarse `match_quality: "exact" | "partial"` string with a four-dimensional numeric vector: -| Dimension | Field | Range | Meaning | -|---|---|---|---| -| Type match | `type_match` | `[0, 1]` | Return type + parameter type match degree | -| Cost fit | `cost_fit` | `[0, 1]` | Estimated cost vs. remaining budget fit | -| Effect fit | `required_effects_fit` | `{0, 1}` | All required effects available (boolean) | -| Error coverage | `error_coverage` | `[0, 1]` | Fraction of candidate's declared errors covered by context | +| Dimension | Field | Range | Meaning | +| -------------- | ---------------------- | -------- | ---------------------------------------------------------- | +| Type match | `type_match` | `[0, 1]` | Return type + parameter type match degree | +| Cost fit | `cost_fit` | `[0, 1]` | Estimated cost vs. remaining budget fit | +| Effect fit | `required_effects_fit` | `{0, 1}` | All required effects available (boolean) | +| Error coverage | `error_coverage` | `[0, 1]` | Fraction of candidate's declared errors covered by context | This scoring vector is still a **target** contract. 
In particular, `cost_fit` should ultimately be computed against checked residual context, but current @@ -355,22 +349,19 @@ Each candidate also includes an `adjustments` array of human-readable notes (e.g #### Prospective additive extensions (not yet stable output) -The next HoleReport additions should remain on the same `v0.x` lineage and are -expected to be **additive** rather than schema-breaking: +The following extensions to the HoleReport schema are defined for future adoption: 1. **`effect_context`** — active handler stack, already-discharged effects, and the visible effect aliases / interfaces at the hole site. 2. **`residual_context`** — remaining checked obligations after the current prefix, including a 4D cost vector (`budget_declared`, `cost_before`, - `budget_residual`) plus any still-unhandled effect / error obligations. This - stays on the `v0.x` line and is the planned successor to the legacy - scalar-style `cost_budget` snapshot once the compiler computes real residuals. + `budget_residual`) plus any still-unhandled effect / error obligations. 3. **`rejection_reasons`** — structured VERIFY/REJECT feedback explaining why a proposed fill failed (for example: `type_mismatch`, `effect_leak`, `budget_exceeded`, `duplicate_handler_match`). -These fields are target behavior for a future v0.x slice. Today's stable output -remains the shared hole object described above. +When adopted, these fields should be added to the existing schema as additive +fields, preserving backward compatibility. 
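
One way a consumer can honor the additive contract is to read the fields it knows and ignore the rest. The sketch below is a minimal Python illustration, not part of any Spore tooling: the `summarize_hole` helper and the sample payload values are hypothetical, and the nesting of `hole.name` and `type.expected` is assumed from the dotted field paths in the base-field table. Only the field names themselves come from this SEP.

```python
import json

# Prospective extension fields named in this SEP; none is guaranteed to be present.
OPTIONAL_EXTENSIONS = ("effect_context", "residual_context", "rejection_reasons")


def summarize_hole(report_json):
    """Read the base fields a consumer needs and note which extensions are present.

    Unknown keys are ignored on purpose, so newer additive fields cannot
    break this reader. The nested layout is an assumption (see lead-in).
    """
    report = json.loads(report_json)
    return {
        "name": report["hole"]["name"],
        "expected_type": report["type"]["expected"],
        "available_effects": report.get("available_effects", []),
        "extensions_present": [k for k in OPTIONAL_EXTENSIONS if k in report],
    }


# Hypothetical payload: base fields, one extension (contents left empty because
# its shape is not yet stable), and one field this reader has never heard of.
sample = json.dumps({
    "hole": {"name": "process_order"},
    "type": {"expected": "Receipt"},
    "available_effects": ["NetWrite"],
    "residual_context": {},
    "some_future_field": True,
})
print(summarize_hole(sample))
```

Because the reader never enumerates the full key set, adding `effect_context` or `rejection_reasons` later would change only the `extensions_present` list and would not require a schema bump on the consumer side.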
#### Extension B: Binding Dependency Graph @@ -425,22 +416,22 @@ Groups errors by their source operation with handling suggestions: Suggestion generation rules: -| Pattern | Suggestion | -|---|---| -| Single error, propagable | `"early return with ?"` | -| Multiple errors, same source | `"match on error type"` | -| Retryable (contains `Timeout`/`Retry`) | `"retry with backoff"` | +| Pattern | Suggestion | +| -------------------------------------- | ----------------------- | +| Single error, propagable | `"early return with ?"` | +| Multiple errors, same source | `"match on error type"` | +| Retryable (contains `Timeout`/`Retry`) | `"retry with backoff"` | | Error in enclosing function's `!` list | `"propagate to caller"` | #### Error System Integration (Three-Field Model) The enclosing function's declared error list (`! Err1 | Err2 | Err3`) is partitioned into three categories at each hole site: -| Field | Meaning | -|---|---| -| `errors_to_handle` | Errors not yet handled by code before the hole. The filling should handle or propagate these. | +| Field | Meaning | +| ------------------------ | --------------------------------------------------------------------------------------------------------------------------------------- | +| `errors_to_handle` | Errors not yet handled by code before the hole. The filling should handle or propagate these. | | `errors_already_handled` | Errors that were caught/handled by code before the hole (e.g., via `catch` or `match`). The filling does not need to worry about these. | -| `errors_passthrough` | Errors that can propagate upward to the caller. The filling may also propagate them but does not need explicit handling. | +| `errors_passthrough` | Errors that can propagate upward to the caller. The filling may also propagate them but does not need explicit handling. 
| Example: @@ -480,11 +471,11 @@ Given a module M, the Hole Dependency Graph G = (V, E) is: Edges are classified into three dependency types: -| Type | Notation | Meaning | -|---|---|---| -| Type dependency | `type` | h₂'s expected type contains a type variable solvable only after h₁ is filled | -| Value dependency | `value` | h₂'s available bindings include a value whose data-flow traces back to h₁ | -| Cost dependency | `cost` | h₂'s target checked residual context depends on h₁'s actual cost | +| Type | Notation | Meaning | +| ---------------- | -------- | ---------------------------------------------------------------------------- | +| Type dependency | `type` | h₂'s expected type contains a type variable solvable only after h₁ is filled | +| Value dependency | `value` | h₂'s available bindings include a value whose data-flow traces back to h₁ | +| Cost dependency | `cost` | h₂'s target checked residual context depends on h₁'s actual cost | #### Graph Construction @@ -612,11 +603,11 @@ function compute_fill_order(G: Graph) -> Result[List[Set[Hole]], CycleError]: **Proof.** By induction on layer index k. -**Base case (k = 0):** L₀ = { h ∈ V | in-degree(h) = 0 }. These holes have no predecessors — their types, bindings, and any target residual dependencies are fully determined. They are fillable. In current implementations the exposed `cost_budget` payload may still be a compatibility placeholder, but the dependency statement here is about the target checked-residual model. Since we take *all* zero in-degree nodes, no fillable hole is missed. +**Base case (k = 0):** L₀ = { h ∈ V | in-degree(h) = 0 }. These holes have no predecessors — their types, bindings, and any cost dependencies are fully determined. They are fillable. Since we take _all_ zero in-degree nodes, no fillable hole is missed. **Inductive step (k → k+1):** Assume layers L₀ through Lₖ are correctly computed and all holes in them are filled. Let G' be the subgraph remaining after removing L₀ ∪ ... 
∪ Lₖ. -Lₖ₊₁ = { h ∈ V(G') | in-degree_{G'}(h) = 0 } +Lₖ₊₁ = { h ∈ V(G') | in-degree\_{G'}(h) = 0 } For any h ∈ Lₖ₊₁: all predecessors of h in G lie in L₀ ∪ ... ∪ Lₖ (already filled), so h's constraints are fully resolved — h is fillable. @@ -630,14 +621,14 @@ Therefore Lₖ₊₁ is exactly the set of newly fillable holes at round k+1. #### Complexity Analysis -| Operation | Time Complexity | Notes | -|-----------|----------------|-------| -| Graph construction | O(\|V\| × B) | B = average bindings per hole | -| Topological sort (layered) | O(\|V\| + \|E\|) | Kahn's algorithm variant | -| Cycle detection | O(\|V\| + \|E\|) | DFS coloring (by-product of topo sort) | +| Operation | Time Complexity | Notes | +| -------------------------------- | ---------------- | -------------------------------------------- | +| Graph construction | O(\|V\| × B) | B = average bindings per hole | +| Topological sort (layered) | O(\|V\| + \|E\|) | Kahn's algorithm variant | +| Cycle detection | O(\|V\| + \|E\|) | DFS coloring (by-product of topo sort) | | Incremental update (single fill) | O(\|neighbors\|) | Only the filled hole's neighbors are touched | -| Parallel scheduling | O(\|V\|) | Traverse in-degree array to find ready set | -| JSON serialization | O(\|V\| + \|E\|) | Linear scan of graph structure | +| Parallel scheduling | O(\|V\|) | Traverse in-degree array to find ready set | +| JSON serialization | O(\|V\| + \|E\|) | Linear scan of graph structure | **Space complexity:** O(\|V\| + \|E\|) for the graph structure and in-degree table. 
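
The layered ordering and its cycle check can be sketched in Python as follows. This is a minimal illustration of a Kahn-style layered topological sort, not the compiler's implementation: the `compute_fill_order` name mirrors the pseudocode in this SEP, while the edge-list input shape, the `CycleError` class, and the edges between the sample holes are assumptions chosen for the example.

```python
from collections import defaultdict


class CycleError(Exception):
    """Raised when the hole dependency graph contains a cycle."""


def compute_fill_order(holes, edges):
    """Return a list of layers; each hole depends only on holes in earlier layers.

    holes: iterable of hole names.
    edges: iterable of (upstream, downstream) pairs, meaning the downstream
           hole depends on the upstream hole. Both ends must appear in `holes`.
    """
    in_degree = {h: 0 for h in holes}
    successors = defaultdict(list)
    for upstream, downstream in edges:
        successors[upstream].append(downstream)
        in_degree[downstream] += 1

    layers = []
    ready = {h for h, d in in_degree.items() if d == 0}  # layer 0: no predecessors
    visited = 0
    while ready:
        layers.append(ready)
        visited += len(ready)
        next_ready = set()
        for h in ready:
            for succ in successors[h]:
                in_degree[succ] -= 1
                if in_degree[succ] == 0:
                    next_ready.add(succ)
        ready = next_ready

    if visited != len(in_degree):
        # Holes that never reach in-degree 0 lie on, or downstream of, a cycle.
        stuck = sorted(h for h, d in in_degree.items() if d > 0)
        raise CycleError(f"circular hole dependency among: {stuck}")
    return layers


# Assumed example graph: two independent roots feed one hole, which feeds another.
layers = compute_fill_order(
    ["?validate_input", "?check_auth", "?process_order", "?send_receipt"],
    [("?validate_input", "?process_order"),
     ("?check_auth", "?process_order"),
     ("?process_order", "?send_receipt")],
)
for k, layer in enumerate(layers):
    print(k, sorted(layer))
```

Each edge is visited once and each hole enters exactly one layer, which matches the O(|V| + |E|) bound in the table above. Holes within the same layer have no dependencies on each other, so they are the ones that could be handed to separate Agents in parallel.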
@@ -724,7 +715,7 @@ The Agent protocol defines a five-state machine with no explicit RETRY state: ▼ │ ┌──────────┐ │ │ ANALYZE │── sporec query-hole FILE ?name --json │ - └────┬─────┘ receive full HoleReport v0.3 │ + └────┬─────┘ receive full HoleReport │ │ │ │ generate fill code │ ▼ │ @@ -752,17 +743,7 @@ The Agent protocol defines a five-state machine with no explicit RETRY state: └── no → COMMIT (all holes filled) ``` -> **Current implementation status (2026-04):** -> -> - `sporec holes FILE --json` is the stable batch discovery surface and emits a -> root object with `holes` plus `dependency_graph`. -> - `sporec query-hole FILE ?name --json` returns one hole object with the same -> shared fields as each entry in the batch report. -> - `spore watch --json` already emits NDJSON `compile_result` events and a -> summary-style `hole_graph_update`, but it does **not** yet stream the full -> dependency graph or per-hole `hole_update` payloads. - -**DISCOVER**: In the current stable workflow, the Agent discovers work either by reading `sporec holes FILE --json` or by listening to `spore watch --json` for `compile_result` plus summary `hole_graph_update` events. The long-term target is for watch mode to carry the full dependency graph, `ready_to_fill` set, and richer per-hole updates directly. +**DISCOVER**: The Agent discovers work via `sporec holes FILE --json` (batch) or `spore watch --json` (event stream). Both surfaces expose hole metadata and dependency information. **ANALYZE**: For each selected hole, the Agent requests its HoleReport via `sporec query-hole FILE ?name --json`. It examines: @@ -776,7 +757,7 @@ The Agent protocol defines a five-state machine with no explicit RETRY state: **PROPOSE**: The Agent writes fill code into the source file, replacing `?name`. This is an atomic operation — one hole per write. The Agent must only reference bindings visible at the hole site (as listed in `bindings`). 
-**VERIFY**: The `spore watch` process detects the file change, triggers incremental compilation, and currently emits a `compile_result` event such as: +**VERIFY**: The `spore watch` process detects the file change, triggers incremental compilation, and emits a `compile_result` event: ```json { @@ -788,11 +769,9 @@ The Agent protocol defines a five-state machine with no explicit RETRY state: } ``` -Current watch output is compile-result oriented; richer transport states such as `accepted`, `rejected`, and `conflict` remain part of the long-term protocol vocabulary rather than the stable watch payload. - **ACCEPT**: Compilation succeeded. The hole is marked `filled`. The dependency graph is recalculated, possibly unlocking blocked holes. The Agent returns to DISCOVER. -**REJECT**: Compilation failed. Today that appears as a watch `compile_result` with `status: "error"` plus compiler diagnostics; the richer structured rejection payload below remains the target transport shape. A future additive `rejection_reasons` field should capture the normalized machine causes without changing the current v0.x lineage: +**REJECT**: Compilation failed. The watch event carries compiler diagnostics. Structured `rejection_reasons` (defined in the Prospective extensions section above) provide normalized machine-readable failure causes. 
A sample rejection payload: ```json { @@ -936,8 +915,19 @@ The HoleReport for `?process_with_config` includes both the closure parameter `d ```json { "bindings": [ - { "name": "data", "type": "Data", "simulated_value": { "kind": "symbolic", "origin": "closure parameter" } }, - { "name": "config", "type": "Config", "simulated_value": { "kind": "symbolic", "origin": "captured from make_processor" } } + { + "name": "data", + "type": "Data", + "simulated_value": { "kind": "symbolic", "origin": "closure parameter" } + }, + { + "name": "config", + "type": "Config", + "simulated_value": { + "kind": "symbolic", + "origin": "captured from make_processor" + } + } ] } ``` @@ -1011,7 +1001,7 @@ This is the central section of SEP-0005. The hole system is designed with AI Age ### Information Self-Sufficiency -A HoleReport on the current v0.x lineage is **self-contained**. An Agent +A HoleReport object is **self-contained**. An Agent reading a report needs zero additional context to attempt a fill. The report includes: @@ -1060,7 +1050,7 @@ In the **long-term richer protocol**, multiple Agents can fill independent holes 4. On ACCEPT, a new `hole_graph_update` unlocks blocked holes 5. Agents re-enter DISCOVER to claim newly available holes -Today, the stable implementation approximates this by combining the shared batch hole report (`sporec holes FILE --json`) with summary watch events. +Today, the stable implementation approximates this by combining the batch hole report (`sporec holes FILE --json`) with summary watch events. **Conflict handling**: If two Agents attempt the same hole, the first writer wins (file-level lock). 
The second receives a `CONFLICT` signal and selects another hole: @@ -1090,23 +1080,23 @@ If an Agent fails repeatedly on a hole, it should: ### Real-Time Event Stream -The **current stable** NDJSON contract from `spore watch --json` is intentionally summary-oriented: +`spore watch --json` emits newline-delimited JSON events: ```text {"event":"compile_result","file":"/tmp/spore-step9-watch.sp","status":"ok","errors":[],"timestamp":1775999403} {"event":"hole_graph_update","holes_total":1,"filled_this_cycle":0,"ready_to_fill":1,"blocked":0} ``` -Current event types: +Event types: -| Event | Trigger | Data | -|---|---|---| -| `compile_result` | Each incremental compilation cycle | File path, status (`ok`/`error`), diagnostics payload, timestamp | +| Event | Trigger | Data | +| ------------------- | ----------------------------------------------- | ---------------------------------------------------------------------------------- | +| `compile_result` | Each incremental compilation cycle | File path, status (`ok`/`error`), diagnostics payload, timestamp | | `hole_graph_update` | After a compile cycle that still contains holes | Hole-count summary: `holes_total`, `filled_this_cycle`, `ready_to_fill`, `blocked` | -This stream is already enough for a practical DISCOVER/VERIFY loop: agents can watch compile success or failure and use the hole summary to decide when to re-run `sporec holes FILE --json` or inspect a specific hole with `sporec query-hole FILE ?name --json`. +This stream supports a practical DISCOVER/VERIFY loop: agents can watch compile success or failure and use the hole summary to decide when to re-run `sporec holes FILE --json` or inspect a specific hole with `sporec query-hole FILE ?name --json`. -A **future richer transport** may add full dependency-graph payloads and per-hole update events, but those are not part of the current stable watch contract. 
When that richer transport lands, it should extend the shared hole/diagnostic model rather than inventing a separate schema. +A future extension of this transport may add full dependency-graph payloads and per-hole update events. When that richer transport is adopted, it should extend the shared hole/diagnostic model rather than inventing a separate schema. ```pseudocode while line = read_line(stdin): @@ -1168,12 +1158,9 @@ t13 COMMIT ── all holes filled ──────────── All hole-related commands support `--json` for machine consumption. -**`sporec holes FILE --json`:** emits the batch hole report. The stable root object contains `holes` plus `dependency_graph`. +**`sporec holes FILE --json`:** emits the batch hole report. The root object contains `holes` plus `dependency_graph`. -The JSON example below intentionally shows today's stable `cost_budget` field -name. Its numeric contents are illustrative compatibility data, not a claim that -current releases already emit authoritative checked 4D residual vectors; that is -the planned `residual_context` extension on the same `v0.x` lineage. +The JSON example below shows the `cost_budget` field as a scalar summary. A future `residual_context` extension (see Prospective extensions) may add checked 4D residual vectors. 
```json { @@ -1258,7 +1245,12 @@ The `dependency_graph` embedded in the batch hole report is the authoritative ma { "from": "?process_order", "to": "?send_receipt", "kind": "type" } ], "roots": ["?validate_input", "?check_auth"], - "suggested_order": ["?validate_input", "?check_auth", "?process_order", "?send_receipt"] + "suggested_order": [ + "?validate_input", + "?check_auth", + "?process_order", + "?send_receipt" + ] } ``` @@ -1278,15 +1270,15 @@ The hole system integrates with Language Server Protocol: ### Hole-specific diagnostics -| Code | Severity | Message | -|---|---|---| -| `H0001` | Info | `hole ?name: expected type T` — standard hole report | -| `H0002` | Warning | `hole-type-conflict: annotation T1 conflicts with inferred T2` | -| `H0003` | Info | `partial function F depends on partial function G` | -| `H0101` | Error | `duplicate hole name ?name in module M` | -| `H0201` | Error | `hole ?name in signature position (not allowed)` | -| `H0301` | Error | `circular hole dependency: ?A → ?B → ?A` | -| `H0401` | Info | `filling of ?name revealed new hole ?other` | +| Code | Severity | Message | +| ------- | -------- | -------------------------------------------------------------- | +| `H0001` | Info | `hole ?name: expected type T` — standard hole report | +| `H0002` | Warning | `hole-type-conflict: annotation T1 conflicts with inferred T2` | +| `H0003` | Info | `partial function F depends on partial function G` | +| `H0101` | Error | `duplicate hole name ?name in module M` | +| `H0201` | Error | `hole ?name in signature position (not allowed)` | +| `H0301` | Error | `circular hole dependency: ?A → ?B → ?A` | +| `H0401` | Info | `filling of ?name revealed new hole ?other` | ### Cost diagnostics for partial functions @@ -1314,7 +1306,7 @@ On failed fill: full diagnostic with `root_cause`, `fix_hints`, and `suggestion` 4. **Scoring weight rigidity**: The hard-coded weights (0.40, 0.20, 0.25, 0.15) may not be optimal for all projects. 
However, making them configurable increases cognitive burden, and the expected variance across projects is low. -5. **Hand-rolled JSON serialization**: The current implementation hand-rolls JSON serialization for v0.1 to minimize dependencies. This is intentionally simple but acknowledged as fragile for complex nested structures. serde migration is tracked as future work (see Unresolved Questions §10). +5. **Serialization fragility**: The serialization format is intentionally simple. A more structured schema-driven encoder may be adopted in the future without changing the external payload contract. 6. **Parallel fill coordination overhead**: File-level locking for multi-Agent fills adds synchronization cost. For small projects this overhead dominates the benefit. @@ -1328,7 +1320,7 @@ Rejected. Anonymous holes create ambiguity in CLI queries (`sporec query-hole FI ### Alternative 2: Holes Affect Signature Hashes -Rejected. If holes changed the signature hash, downstream dependents would need recompilation every time a hole is filled — even though the *contract* never changed. This breaks snapshot stability during development. +Rejected. If holes changed the signature hash, downstream dependents would need recompilation every time a hole is filled — even though the _contract_ never changed. This breaks snapshot stability during development. ### Alternative 3: Explicit Priority Annotations on Holes @@ -1391,18 +1383,18 @@ These are runtime markers with no compiler support. 
They provide no type informa ## Backward compatibility and migration -### Schema Versioning +### Schema Evolution -- HoleReport stays on the current `v0.x` lineage; additive extensions must not - force a detached `v3` naming story -- Current implementation payloads are shared JSON objects without an explicit - schema tag; any future schema identifier should stay in the - `spore/hole-report/v0.x` family -- All v0.2 fields are preserved with identical semantics -- New v0.3 fields (`binding_dependencies`, `confidence`, `error_clusters`, - `candidates[].scores`, `candidates[].overall`, `candidates[].adjustments`) - are additive -- Tools that do not recognize newer v0.x fields can safely ignore them +- HoleReport stays on the current schema; additive extensions + must not force a detached naming story. +- HoleReport objects are JSON objects without an explicit + schema tag; any future schema identifier should avoid embedding a + design revision label. +- Existing fields are preserved with identical semantics. +- New fields such as `binding_dependencies`, `confidence`, `error_clusters`, + `candidates[].scores`, `candidates[].overall`, and + `candidates[].adjustments` are additive. +- Tools that do not recognize newer fields can safely ignore them. ### CLI Flags @@ -1420,12 +1412,14 @@ These are runtime markers with no compiler support. They provide no type informa 1. **Existing complete code**: Unaffected. No holes means no HoleReports, no dependency graphs, no Agent protocol activation. 2. **New code with holes**: Opt-in by writing `?name` in function bodies. -3. **Agent tooling**: Agents should treat the payload as a `v0.x`-lineage schema and ignore unknown additive fields. If an explicit schema identifier is emitted, it should stay in the `"spore/hole-report/v0.x"` family. +3. **Agent tooling**: Agents should treat the payload as an additive shared schema and ignore unknown additive fields. 
If an explicit schema identifier is emitted, it should avoid embedding a design revision label. --- ## Unresolved questions +### Design questions + 1. **Cross-module hole dependencies**: The current dependency graph is scoped to a single module. How should holes that depend on partial functions in other modules be represented? The `partial` marker propagates, but the graph does not yet span modules. 2. **Type holes**: This SEP covers value holes (`?name`). Type holes (`?T_Name` in uppercase positions) are mentioned but not fully specified. Should they be part of this SEP or a separate one? @@ -1434,14 +1428,18 @@ These are runtime markers with no compiler support. They provide no type informa 4. **Agent identity and coordination**: The multi-Agent protocol uses file-level locking. Should Agents have explicit identities? Should there be a coordinator process, or is the decentralized claim-and-lock protocol sufficient at scale? -5. **Hole versioning**: When a hole is filled, rejected, and re-filled, should the system maintain a history of fill attempts? This would enable better Agent learning but increases storage. +5. **Hole attempt history**: When a hole is filled, rejected, and re-filled, should the system maintain a history of fill attempts? This would enable better Agent learning but increases storage. 6. **Partial function exports**: The current design marks partial functions in module exports. Should importers be able to depend on partial functions (receiving symbolic values), or should partial functions be hidden from external modules? -7. **DAG visualization**: The specification includes ASCII DAG output. Should a standard visual format (DOT/Graphviz, Mermaid) be part of the protocol? +7. **DAG visualization**: The specification includes ASCII DAG output. Should a standard visual format (DOT/Graphviz, Mermaid) be part of the protocol, or should visualization remain an editor/tooling concern outside the stable machine payload? 8. 
**Dynamic priority adjustment**: Should the filling order adapt based on Agent performance history (e.g., prioritize holes similar to ones the Agent has successfully filled before)? -9. **Cost dependency precision**: Cost dependencies currently assume sequential execution within a block. For branches, should cost dependencies be path-sensitive? +9. **Cost dependency precision**: Cost dependencies assume sequential execution within a block. For branches, should cost dependencies be path-sensitive? + +### Implementation follow-up -10. **serde migration**: The hand-rolled JSON serializer should eventually be replaced. When should this migration happen, and should it be a breaking change to the internal API? +The hand-rolled JSON serializer should eventually be replaced with serde or a +schema-driven encoder. That migration should preserve the external payload +contract unless a future SEP deliberately defines a breaking schema split. diff --git a/seps/SEP-0006-compiler-architecture.md b/seps/SEP-0006-compiler-architecture.md index 69c1ee0..ffe672a 100644 --- a/seps/SEP-0006-compiler-architecture.md +++ b/seps/SEP-0006-compiler-architecture.md @@ -31,15 +31,15 @@ representation. Five major passes (lex, parse, resolve+desugar, typecheck+capcheck+costcheck, codegen) transform source programs through these layers. -Incremental compilation is achieved through the **salsa** framework with a +Incremental compilation is achieved through demand-driven memoized computation with a two-tier content-addressed hashing strategy: **sig hash** (computed at the Resolve layer) controls downstream recompilation, and **impl hash** (computed at the TypeCheck layer) controls codegen caching. A **watch mode** provides real-time diagnostics via human-readable terminal output or machine-consumable NDJSON events. -The Proof-of-Concept (PoC) phase uses a tree-walking interpreter; Cranelift -codegen is deferred to the prototype phase. 
+An initial implementation may use a tree-walking interpreter; native code +generation is deferred to a later stage. --- @@ -49,7 +49,7 @@ A language that targets both human developers and AI agents as first-class users needs a compiler whose architecture is: 1. **Transparent** — every compilation stage produces an observable, queryable - representation so that diagnostics can pinpoint *exactly* what went wrong, + representation so that diagnostics can pinpoint _exactly_ what went wrong, where, and why. 2. **Incremental** — developers and agents iterate rapidly; the compiler must recompile only what changed, propagating invalidation no further than @@ -142,28 +142,20 @@ $ spore watch src/main.sp ### Error output modes -| Mode | Flag | Audience | Contract | -|------|------|----------|----------| -| **Default** | *(none)* | Developers | Human-readable projection of the canonical Diagnostic IR | +| Mode | Flag | Audience | Contract | +| ----------- | ----------- | ----------------------------------- | ---------------------------------------------------------------------------- | +| **Default** | _(none)_ | Developers | Human-readable projection of the canonical Diagnostic IR | | **Verbose** | `--verbose` | Developers debugging complex errors | Default projection plus optional analysis detail not required by the base IR | -| **JSON** | `--json` | CI/CD, scripts, LSP, agents | Machine-readable projection of the canonical Diagnostic IR | +| **JSON** | `--json` | CI/CD, scripts, LSP, agents | Machine-readable projection of the canonical Diagnostic IR | The architectural target is one shared Diagnostic IR with multiple projections. `Default` and `--json` are two renderings of the same diagnostics; `--verbose` adds extra analysis detail on top of that shared model rather than defining a separate protocol. 
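
The "one IR, several projections" idea can be illustrated with a small Python sketch. Everything named here is hypothetical: the `Diagnostic` fields, the `render_human` and `render_json` helpers, the output layout, and the sample file path are assumptions for illustration and do not describe the actual Diagnostic IR or `sporec` output. Only the `H0101` code and its message shape are borrowed from the hole diagnostics table in SEP-0005.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Diagnostic:
    """Hypothetical minimal diagnostic record shared by every renderer."""
    code: str
    severity: str
    message: str
    file: str
    line: int
    column: int


def render_human(d):
    """Human-readable projection, roughly what a terminal renderer might print."""
    return f"{d.severity}[{d.code}]: {d.message}\n  --> {d.file}:{d.line}:{d.column}"


def render_json(d):
    """Machine-readable projection derived from the same record."""
    return json.dumps(asdict(d), sort_keys=True)


diag = Diagnostic(
    code="H0101",
    severity="error",
    message="duplicate hole name ?parse in module orders",
    file="src/orders.sp",
    line=12,
    column=5,
)
print(render_human(diag))
print(render_json(diag))
```

Both renderers read the same record and neither owns any data the other lacks, which is the property the shared-IR design is after: a verbose mode would attach extra analysis to the record instead of defining its own structure.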
-> **Current implementation status (2026-04):** -> -> - the shared diagnostics direction is no longer hypothetical: CLI JSON, LSP, -> and hole-reporting surfaces are being aligned as projections over the same -> compiler-facing model -> - `sporec holes FILE --json` and `sporec query-hole FILE ?name --json` already -> expose a shared typed-hole machine protocol -> - `spore watch --json` already emits machine-readable `compile_result` and -> `hole_graph_update` events -> - the remaining work is freezing the minimal canonical field set and layering -> richer watch / auto-fix transports without inventing parallel schemas +> The shared diagnostics direction is the design target: CLI JSON, LSP, +> and hole-reporting surfaces should be projections over the same +> compiler-facing model. --- @@ -222,12 +214,12 @@ Source Text (.sp files) ▼ ┌──────────────────────────────────────────────────────────────────────────┐ │ Pass 5: CODEGEN │ -│ TypedHIR → Cranelift IR → Native Code │ -│ ─ Cranelift IR serves as LIR (no separate low-level IR) │ +│ TypedHIR → Native Code │ +│ ─ Codegen backend serves as LIR (no separate low-level IR) │ │ ─ Function-level granularity (fits content-addressed caching) │ │ ─ Tail-call optimization (TCO) applied here │ │ ─ Pattern matching lowered to branches/jumps │ -│ ─ [PoC phase: tree-walking interpreter instead] │ +│ ─ Initial implementation may use tree-walking interpreter │ └──────────────────────────────────────────────────────────────────────────┘ │ ▼ @@ -252,25 +244,14 @@ The diagnostics architecture is split into three layers: This separation is intentional: -- `thiserror` belongs at crate boundaries and implementation-facing error enums -- `ariadne` belongs only to the human-rendering layer -- the core compiler should not depend on terminal-formatting concerns +- Implementation-facing error types belong at crate boundaries +- Terminal-formatting belongs only to the human-rendering layer +- The core compiler should not depend on 
terminal-formatting concerns - JSON and LSP should be derived from the same IR, not maintained as parallel ad hoc structures -**Current implementation status (2026-04):** - -- parser, type checking, and module loading already expose typed diagnostic - producers -- JSON, LSP, and hole-reporting are no longer separate greenfield designs; they - already have machine-readable surfaces and now need convergence rather than - reinvention -- some crate boundaries still use `Result<_, String>` or `Vec`, so the - migration is not finished - -The next diagnostics migration step should finish converging those remaining -stringly boundaries on a shared `sporec-diagnostics` crate without regressing the -working JSON/LSP/hole surfaces that already exist. +The diagnostics architecture aims for a shared `Diagnostic` IR crate that feeds +all renderers (human CLI, JSON, LSP, and future watch/event-stream transports). ### IR layers @@ -320,12 +301,12 @@ The Resolve pass performs: The Desugar pass (merged into Resolve) canonicalizes all syntactic sugar: -| Sugar | Desugared form | -|-------|---------------| -| `a \|> f(b)` | `f(a, b)` | -| `expr?` | `match expr { Ok(v) => v, Err(e) => return Err(e.into()) }` | -| `f"hello {name}"` | `InterpolatedStr([Literal("hello "), Expr(name)])` | -| `t"hello {name}"` | `Template([Literal("hello "), Interpolation(name)])` | +| Sugar | Desugared form | +| ----------------- | ----------------------------------------------------------- | +| `a \|> f(b)` | `f(a, b)` | +| `expr?` | `match expr { Ok(v) => v, Err(e) => return Err(e.into()) }` | +| `f"hello {name}"` | `InterpolatedStr([Literal("hello "), Expr(name)])` | +| `t"hello {name}"` | `Template([Literal("hello "), Interpolation(name)])` | **sig hash is computed at this layer.** It covers: @@ -344,17 +325,17 @@ the input to codegen. 
The unified TypeCheck pass performs: -| Sub-pass | Responsibility | -|----------|---------------| -| **Type inference** | Bidirectional: signatures fully annotated, function bodies inferred | -| **Trait resolution** | Trait/Effect resolution, associated types, GAT instantiation | -| **Const generics** | Evaluation of compile-time constant generic parameters | -| **Exhaustiveness** | Verify match expressions cover all variants | -| **Error sets** | Propagation and consistency of `! ErrorType` declarations | -| **Refinement types** | L0 decidable predicates + L1 abstract interpretation propagation | -| **CapCheck** | Function body effect usage ⊆ declared `uses` set; module ceiling check | -| **CostCheck** | Abstract interpretation of 4D cost vector: `compute(op) + alloc(cell) + io(call) + parallel(lane)`; verify ≤ declared bound | -| **Hole reports** | Generate full context: type, available bindings, effect set, cost budget, candidate functions | +| Sub-pass | Responsibility | +| -------------------- | --------------------------------------------------------------------------------------------------------------------------- | +| **Type inference** | Bidirectional: signatures fully annotated, function bodies inferred | +| **Trait resolution** | Trait/Effect resolution, associated types, GAT instantiation | +| **Const generics** | Evaluation of compile-time constant generic parameters | +| **Exhaustiveness** | Verify match expressions cover all variants | +| **Error sets** | Propagation and consistency of `! 
ErrorType` declarations | +| **Refinement types** | L0 decidable predicates + L1 abstract interpretation propagation | +| **CapCheck** | Function body effect usage ⊆ declared `uses` set; module ceiling check | +| **CostCheck** | Abstract interpretation of 4D cost vector: `compute(op) + alloc(cell) + io(call) + parallel(lane)`; verify ≤ declared bound | +| **Hole reports** | Generate full context: type, available bindings, effect set, cost budget, candidate functions | Effect checking is merged into TypeCheck because `Effect = Trait` in Spore — effect resolution shares the same infrastructure as trait resolution. @@ -375,7 +356,7 @@ There is **no separate low-level IR**. Cranelift IR serves as the LIR: - Pattern matching is lowered to branch/jump instructions during codegen - Tail-call optimization (TCO) is applied during the TypedHIR → Cranelift translation -- In the PoC phase, a tree-walking interpreter replaces Cranelift; the codegen +- An initial implementation may use a tree-walking interpreter; the codegen pass is a clean abstraction boundary ### Design rationale for IR layer count @@ -398,37 +379,16 @@ does not have. ### Incremental compilation -#### salsa integration - -Spore uses the [salsa](https://github.com/salsa-rs/salsa) framework for -demand-driven, memoized, incremental computation. Each compilation pass is a -salsa-tracked function: - -```rust -#[salsa::input] -struct SourceFile { - #[return_ref] - path: PathBuf, - #[return_ref] - contents: String, -} - -#[salsa::tracked] -fn lex(db: &dyn Db, file: SourceFile) -> TokenStream { ... } - -#[salsa::tracked] -fn parse(db: &dyn Db, tokens: TokenStream) -> Ast { ... } - -#[salsa::tracked] -fn resolve(db: &dyn Db, ast: Ast) -> Hir { ... } -// side product: sig_hash +#### Incremental compilation strategy -#[salsa::tracked] -fn type_check(db: &dyn Db, hir: Hir) -> TypedHir { ... } -// side product: impl_hash, hole_reports, diagnostics +The compiler uses demand-driven incremental computation. 
Each compilation pass is memoized as a tracked query with automatic invalidation on input changes. The pass signatures are: -#[salsa::tracked] -fn codegen(db: &dyn Db, typed_hir: TypedHir) -> CompiledModule { ... } +```text +lex(file) → TokenStream +parse(tokens) → Ast +resolve(ast) → Hir // side product: sig_hash +type_check(hir) → TypedHir // side product: impl_hash, hole_reports, diagnostics +codegen(typed_hir) → CompiledModule ``` #### Two-tier hash strategy @@ -462,25 +422,37 @@ The incremental compilation strategy relies on two content-addressed hashes: ``` **sig hash** covers: exported type/function signatures, effect requirements, -cost annotations. It does *not* cover: function bodies, private definitions, +cost annotations. It does _not_ cover: function bodies, private definitions, comments, internal hole states. **impl hash** covers: the fully type-checked module content. Partial functions (with holes) have `impl hash = None`. +**spec hash** covers: `spec` blocks attached to functions, trait method +signatures, and `impl` methods. It is separate from `sig hash` so behavioral +contracts can be edited without changing the callable API. It is also separate +from `impl hash` so test metadata can be tracked even when a body is still +partial. 
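The separation between the three hashes can be illustrated with a minimal Rust sketch. Everything here is hypothetical: `FunctionUnit`, `Hashes`, `digest`, and `hashes` are illustration names, the standard-library `DefaultHasher` stands in for BLAKE3, and hashing only the signature and body text is a simplification of "fully type-checked module content".

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified view of one function as the hasher sees it.
struct FunctionUnit {
    /// Callable API: name, parameter types, return type, effects, cost.
    signature: String,
    /// Normalized text of the attached `spec` block (empty if none).
    spec: String,
    /// `None` while the body still contains a hole.
    body: Option<String>,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Hashes {
    sig: u64,
    spec: u64,
    /// `None` for partial functions, mirroring `impl hash = None`.
    implementation: Option<u64>,
}

/// Stand-in digest (assumption): a real implementation would use BLAKE3.
fn digest(parts: &[&str]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for part in parts {
        part.hash(&mut hasher);
    }
    hasher.finish()
}

/// Computes the three hashes independently so that an edit to one
/// surface does not disturb the other two.
fn hashes(unit: &FunctionUnit) -> Hashes {
    Hashes {
        sig: digest(&[unit.signature.as_str()]),
        spec: digest(&[unit.spec.as_str()]),
        implementation: unit
            .body
            .as_ref()
            .map(|body| digest(&[unit.signature.as_str(), body.as_str()])),
    }
}
```

Under these assumptions, editing only the `spec` text changes `spec` while `sig` and `implementation` stay the same, and a unit with `body: None` still gets a `spec` hash even though its `implementation` hash is `None`.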
+ +| Change | Signature hash | Spec hash | Required approval | +| -------------------------------------------- | -------------- | ------------------------------------ | ----------------- | +| Add a `spec` block where none existed | unchanged | changed | `--permit-spec` | +| Add, modify, or remove an `example` or `law` | unchanged | changed | `--permit-spec` | +| Change a callable signature clause | changed | unchanged unless `spec` also changed | `--permit` | + Incremental scenarios: -| Scenario | Example | What happens | -|----------|---------|-------------| -| **A: impl hash unchanged** | Only a comment changed | Skip entirely (<1ms) | -| **B: impl only** | Changed function body, same signature | Recompile this module, skip downstream | -| **C: sig changed** | Changed function parameter type | Recompile this module + cascade to direct dependents | +| Scenario | Example | What happens | +| -------------------------- | ------------------------------------- | ---------------------------------------------------- | +| **A: impl hash unchanged** | Only a comment changed | Skip entirely (<1ms) | +| **B: impl only** | Changed function body, same signature | Recompile this module, skip downstream | +| **C: sig changed** | Changed function parameter type | Recompile this module + cascade to direct dependents | #### Dependency graph traversal When a sig hash changes, the compiler walks the dependency graph in topological order. At each level, it recompiles affected modules in parallel, then checks -whether *their* sig hashes changed before propagating further. This ensures +whether _their_ sig hashes changed before propagating further. 
This ensures minimal recompilation: ```text @@ -554,9 +526,9 @@ t=100ms Begin incremental compile {A, B, C} Current stable events: -| Event | Trigger | Data | -|---|---|---| -| `compile_result` | Each incremental compilation cycle | File path, status (`ok`/`error`), diagnostics payload, timestamp | +| Event | Trigger | Data | +| ------------------- | ----------------------------------------------- | ---------------------------------------------------------------------------------- | +| `compile_result` | Each incremental compilation cycle | File path, status (`ok`/`error`), diagnostics payload, timestamp | | `hole_graph_update` | After a compile cycle that still contains holes | Hole-count summary: `holes_total`, `filled_this_cycle`, `ready_to_fill`, `blocked` | This summary stream is enough for save-compile feedback loops and for agents that pair watch mode with the richer batch hole commands (`sporec holes FILE --json`, `sporec query-hole FILE ?name --json`). @@ -567,12 +539,12 @@ Future revisions may layer more detailed events — for example per-hole updates Watch mode **never exits** (except on Ctrl+C). 
Recovery behavior: -| Situation | Strategy | -|-----------|----------| +| Situation | Strategy | +| ------------------- | ----------------------------------------------------------- | | Syntax / type error | Report diagnostic, retain last successful compilation state | -| Circular dependency | Report error, interrupt affected module subtree | -| File deleted | Remove from dependency graph, mark dependents as errored | -| Compiler panic | Catch and report as internal error, continue watching | +| Circular dependency | Report error, interrupt affected module subtree | +| File deleted | Remove from dependency graph, mark dependents as errored | +| Compiler panic | Catch and report as internal error, continue watching | #### Complete watch terminal output @@ -752,16 +724,18 @@ LSP diagnostics are published directly from the compiler's shared diagnostics pi "method": "textDocument/publishDiagnostics", "params": { "uri": "file:///project/src/auth.sp", - "diagnostics": [{ - "range": { - "start": { "line": 22, "character": 9 }, - "end": { "line": 22, "character": 24 } - }, - "severity": 1, - "code": "E0301", - "source": "spore", - "message": "type mismatch: expected `Token`, found `Str`" - }] + "diagnostics": [ + { + "range": { + "start": { "line": 22, "character": 9 }, + "end": { "line": 22, "character": 24 } + }, + "severity": 1, + "code": "E0301", + "source": "spore", + "message": "type mismatch: expected `Token`, found `Str`" + } + ] } } ``` @@ -789,14 +763,14 @@ A custom `spore/holeUpdate` notification is a **future** extension, not the curr The `spore` binary is built on a curated set of Rust crates chosen for correctness, minimal footprint, and future Spore self-hosting feasibility: -| Crate | Purpose | Bootstrap path | -|-------|---------|---------------| -| `bpaf` | Argument parsing (derive-based) | Parser combinator — natural fit for Spore | -| `ariadne` | Diagnostic rendering (span-based) | Thin wrapper on ANSI formatting | -| `owo-colors` | Terminal color 
output | Pure ANSI escape sequences | -| `notify` | File system watching (`watch` mode) | System call wrapper | -| `serde_json` | JSON serialization (`--json` flag) | Spore will need JSON support | -| `tracing` | Structured logging (`--verbose`) | Replaceable with print-based logging | +| Crate | Purpose | Bootstrap path | +| ------------ | ----------------------------------- | ----------------------------------------- | +| `bpaf` | Argument parsing (derive-based) | Parser combinator — natural fit for Spore | +| `ariadne` | Diagnostic rendering (span-based) | Thin wrapper on ANSI formatting | +| `owo-colors` | Terminal color output | Pure ANSI escape sequences | +| `notify` | File system watching (`watch` mode) | System call wrapper | +| `serde_json` | JSON serialization (`--json` flag) | Spore will need JSON support | +| `tracing` | Structured logging (`--verbose`) | Replaceable with print-based logging | **Self-hosting consideration:** Every crate above wraps a concept that Spore must eventually implement natively (parsing, formatting, file I/O, JSON). The @@ -895,9 +869,7 @@ make richer analysis payloads an explicit later layer. "message": "callee expects `Money` here" } ], - "notes": [ - "enclosing function `charge_customer` uses `PaymentGateway`" - ], + "notes": ["enclosing function `charge_customer` uses `PaymentGateway`"], "help": "try `Money.from_string(\"fifty dollars\")`", "related": [] } @@ -910,16 +882,18 @@ not part of the minimal stable contract. 
#### Diagnostic field requirements -| Field | Type | Required | Description | -|-------|------|----------|-------------| -| `code` | string | **required** | Stable diagnostic code, for example `E0301` | -| `severity` | string | **required** | `error`, `warning`, or `note` | -| `message` | string | **required** | Human-readable headline | -| `primary_span` | object \| null | **required** | Primary source location; `null` only for global diagnostics | -| `secondary_labels` | array | optional | Additional labeled spans related to the same diagnostic | -| `notes` | string[] | optional | Supplemental context lines | -| `help` | string | optional | One actionable fix hint or next step | -| `related` | array | optional | References to related diagnostics or locations | +| Field | Type | Minimal canonical | Required | Description | +| ------------------ | -------------- | :---------------: | ------------ | ----------------------------------------------------------- | +| `code` | string | ✓ | **required** | Stable diagnostic code, for example `E0301` | +| `severity` | string | ✓ | **required** | `error`, `warning`, or `note` | +| `message` | string | ✓ | **required** | Human-readable headline | +| `primary_span` | object \| null | ✓ | **required** | Primary source location; `null` only for global diagnostics | +| `secondary_labels` | array | | optional | Additional labeled spans related to the same diagnostic | +| `notes` | string[] | | optional | Supplemental context lines | +| `help` | string | | optional | One actionable fix hint or next step | +| `related` | array | | optional | References to related diagnostics or locations | + +The **minimal canonical** fields (`code`, `severity`, `message`, `primary_span`) form the stable machine contract. The remaining fields are extension fields that may be absent without breaking consumers. ```json { @@ -963,26 +937,24 @@ not part of the minimal stable contract. 
} ``` -> **Current implementation status (2026-04):** -> -> The shared-model architecture is now the right description of the codebase: hole -> commands already expose a shared typed-hole protocol, and CLI JSON plus LSP are -> being treated as projections over the same diagnostics pipeline. The remaining -> work is to finish freezing the minimal canonical field set and to extend richer -> analysis/watch transports without reintroducing command-private schemas. +> The shared-model architecture is the design target: hole +> commands expose a shared typed-hole protocol, and CLI JSON plus LSP are +> treated as projections over the same diagnostics pipeline. The remaining +> design work is freezing the minimal canonical field set and extending richer +> analysis/watch transports without introducing command-private schemas. #### Error code system All diagnostics carry categorized codes: -| Prefix | Category | Examples | -|--------|----------|----------| -| `E0xxx` | Type errors | type mismatch, missing field, arity error | -| `W0xxx` | Warnings | unused variable, redundant pattern, shadowing | -| `C0xxx` | Effect violations | undeclared effect, exceeding ceiling | -| `K0xxx` | Cost violations | budget exceeded, unbounded call | -| `H0xxx` | Hole diagnostics | hole report, partial function, type conflict | -| `M0xxx` | Module errors | circular dependency, visibility violation | +| Prefix | Category | Examples | +| ------- | ----------------- | --------------------------------------------- | +| `E0xxx` | Type errors | type mismatch, missing field, arity error | +| `W0xxx` | Warnings | unused variable, redundant pattern, shadowing | +| `C0xxx` | Effect violations | undeclared effect, exceeding ceiling | +| `K0xxx` | Cost violations | budget exceeded, unbounded call | +| `H0xxx` | Hole diagnostics | hole report, partial function, type conflict | +| `M0xxx` | Module errors | circular dependency, visibility violation | Every code is queryable: `sporec explain E0301` prints a 
detailed explanation with common causes, examples, and fix strategies. @@ -1020,16 +992,16 @@ $ sporec explain E0301 Target latencies (aspirational, not hard guarantees): -| Operation | Target | Notes | -|-----------|--------|-------| -| Single module recompilation | < 100ms | Type check + effect + cost | -| Dependency graph traversal | < 10ms | Topological sort of module DAG | -| Hash computation (single module) | < 5ms | blake3 of module content | -| Full project initial analysis (~100 modules) | < 5s | Cold start, parallel compilation | -| End-to-end latency (file save → diagnostics) | < 200ms | Including 100ms debounce | -| Hole graph update | < 50ms | Incremental hole summary / `HoleGraphUpdate` regeneration | +| Operation | Target | Notes | +| -------------------------------------------- | ------- | --------------------------------------------------------- | +| Single module recompilation | < 100ms | Type check + effect + cost | +| Dependency graph traversal | < 10ms | Topological sort of module DAG | +| Hash computation (single module) | < 5ms | blake3 of module content | +| Full project initial analysis (~100 modules) | < 5s | Cold start, parallel compilation | +| End-to-end latency (file save → diagnostics) | < 200ms | Including 100ms debounce | +| Hole graph update | < 50ms | Incremental hole summary / `HoleGraphUpdate` regeneration | -**Cache strategy:** Watch mode maintains in-memory caches for hashes, dependency graph, compilation artifacts, and hole state. Persistence to disk is not required for v0.1 but is a future option. +**Cache strategy:** Watch mode maintains in-memory caches for hashes, dependency graph, compilation artifacts, and hole state. Persistence to disk is not required for the current design but is a future option. **Parallelism:** Default = CPU core count. Override with `--jobs N`. @@ -1068,6 +1040,56 @@ is passed. **Lint warnings** (severity = Warning, code prefix `W`) follow the same rule. 
+#### `spore test` + +`spore test` evaluates test files and function-level `spec` blocks. SEP-0001 +owns the syntax of `spec { ... }`; this SEP owns the compiler and runner behavior +for executing examples, generating law inputs, reporting failures, and +tracking spec hashes. + +```spore +fn add(a: I64, b: I64) -> I64 +spec { + example "positive inputs": add(2, 3) == 5 + example "identity": add(0, 42) == 42 +} +{ + a + b +} +``` + +Examples may use expression or block form. In block form, the final expression is +the assertion result: + +```spore +example "handles leap year" { + let d = parse_date("2024-02-29"); + d == Ok(Date { year: 2024, month: 2, day: 29 }) +} +``` + +Laws use explicitly typed lambda parameters and must evaluate to `Bool`: + +```spore +fn parse_date(s: Str) -> Result[Date, ParseError] ! ParseError +spec { + example "ISO date": parse_date("2024-01-15").is_ok() + law "round trip": |d: Date| parse_date(d.to_iso_string()) == Ok(d) +} +{ + ?parse_logic +} +``` + +If a `spec` calls a function body that still contains a hole, normal hole +runtime behavior applies during test execution. The `spec` remains available as +compiler metadata and may be surfaced through hole tooling as described in +SEP-0005. + +`property` is not a compatibility alias. The accepted spec item set is exactly +`example` and `law`; source that still uses `property` is rejected and must be +rewritten by migration tooling or by hand. + #### `spore format` / `spore fmt` `spore format` rewrites source files to conform to the canonical Spore style. @@ -1112,7 +1134,7 @@ analysis gate: The incremental compilation system eliminates the most frustrating part of the edit-compile-test loop: unnecessary waiting. When a developer changes a function -body without altering its signature, *only that module* is recompiled. Downstream +body without altering its signature, _only that module_ is recompiled. Downstream modules are untouched. In practice this means sub-100ms feedback for most edits. 
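The recompilation decision described above can be sketched as a comparison of a module's previous and current hashes. This is a minimal Rust sketch, not the compiler's actual code: `ModuleHashes`, `Action`, and `plan` are hypothetical names, and treating a missing impl hash (a module that still contains holes) as "must recompile" is an assumption.

```rust
/// Hypothetical per-module hash record kept between compile cycles.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ModuleHashes {
    /// Hash of the exported signatures, effects, and cost annotations.
    sig: u64,
    /// Hash of the type-checked content; `None` while holes remain.
    implementation: Option<u64>,
}

/// What the incremental compiler should do for one module.
#[derive(Debug, PartialEq, Eq)]
enum Action {
    /// Nothing relevant changed (for example, only a comment was edited).
    Skip,
    /// The body changed but the signature did not: downstream is untouched.
    RecompileModule,
    /// The signature changed: recompile and cascade to direct dependents.
    RecompileAndCascade,
}

fn plan(previous: ModuleHashes, current: ModuleHashes) -> Action {
    // A signature change always wins, because dependents may be affected.
    if previous.sig != current.sig {
        return Action::RecompileAndCascade;
    }
    match (previous.implementation, current.implementation) {
        // Both hashes are present and equal: the module can be skipped.
        (Some(old), Some(new)) if old == new => Action::Skip,
        // Body changed, or a hash is missing (assumption: recompile).
        _ => Action::RecompileModule,
    }
}
```

For instance, `plan` returns `Action::RecompileModule` when `sig` is identical in both records and `implementation` differs, which is the common case of editing a function body without touching its signature.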
### Actionable diagnostics @@ -1188,14 +1210,14 @@ Diagnostic IR. #### Agent mode selection -| Scenario | Recommended Mode | Rationale | -|----------|-----------------|-----------| -| Agent filling holes | `spore check --json` + hole JSON commands | Stable machine-readable diagnostics plus hole-specific context | -| Agent debugging compiler error | `--json` | Consume code/severity/message/spans/notes/help without parsing text | -| Agent in CI pipeline | `--json` | Stable machine-readable contract | -| Human scanning build output | Default | Concise, color-coded, glanceable | -| Human debugging inference failure | `--verbose` | Richer renderer over the same base diagnostics | -| IDE / LSP client | LSP adapter over Diagnostic IR | Same canonical fields, editor-specific transport | +| Scenario | Recommended Mode | Rationale | +| --------------------------------- | ----------------------------------------- | ------------------------------------------------------------------- | +| Agent filling holes | `spore check --json` + hole JSON commands | Stable machine-readable diagnostics plus hole-specific context | +| Agent debugging compiler error | `--json` | Consume code/severity/message/spans/notes/help without parsing text | +| Agent in CI pipeline | `--json` | Stable machine-readable contract | +| Human scanning build output | Default | Concise, color-coded, glanceable | +| Human debugging inference failure | `--verbose` | Richer renderer over the same base diagnostics | +| IDE / LSP client | LSP adapter over Diagnostic IR | Same canonical fields, editor-specific transport | #### Agent error recovery @@ -1258,21 +1280,18 @@ separate schema. 
Target mapping from the canonical Diagnostic IR: -| Diagnostic IR | LSP `Diagnostic` | Notes | -|---------------|------------------|-------| -| `severity` | `severity` | `error=1`, `warning=2`, `note=3` | -| `code` | `code` | direct copy | -| `message` | `message` | direct copy | -| `primary_span.range` | `range` | convert 1-indexed line/col to LSP 0-indexed positions | -| `related` | `relatedInformation` | when representable | -| `secondary_labels`, `notes`, `help` | `data.spore` | retained even when the base LSP surface cannot render them losslessly | - -> **Current implementation status (2026-04):** -> -> The LSP server already publishes diagnostics from the compiler pipeline. The -> remaining gap is lossless carriage of richer fields such as secondary labels, -> notes, help, and hole-specific updates once the minimal canonical shape is fully -> frozen. +| Diagnostic IR | LSP `Diagnostic` | Notes | +| ----------------------------------- | -------------------- | --------------------------------------------------------------------- | +| `severity` | `severity` | `error=1`, `warning=2`, `note=3` | +| `code` | `code` | direct copy | +| `message` | `message` | direct copy | +| `primary_span.range` | `range` | convert 1-indexed line/col to LSP 0-indexed positions | +| `related` | `relatedInformation` | when representable | +| `secondary_labels`, `notes`, `help` | `data.spore` | retained even when the base LSP surface cannot render them losslessly | + +> The LSP server publishes diagnostics from the compiler pipeline. The +> design goal is lossless carriage of richer fields such as secondary labels, +> notes, help, and hole-specific updates within the LSP protocol. ### JSON as the machine projection @@ -1339,16 +1358,16 @@ The `/ | |___^` format connects multi-line spans visually. 
#### ANSI color scheme -| Element | Color | ANSI Code | -|---------|-------|-----------| -| `error` + error code | Red (bold) | `\x1b[1;31m` | -| `warning` + warning code | Yellow (bold) | `\x1b[1;33m` | -| `note` | Blue (bold) | `\x1b[1;34m` | -| `help` | Green (bold) | `\x1b[1;32m` | -| `hint` | Cyan (bold) | `\x1b[1;36m` | -| Line numbers | Bright blue | `\x1b[94m` | -| Source text | Default | — | -| Underline (`^^^`) | Matches severity color | — | +| Element | Color | ANSI Code | +| ------------------------ | ---------------------- | ------------ | +| `error` + error code | Red (bold) | `\x1b[1;31m` | +| `warning` + warning code | Yellow (bold) | `\x1b[1;33m` | +| `note` | Blue (bold) | `\x1b[1;34m` | +| `help` | Green (bold) | `\x1b[1;32m` | +| `hint` | Cyan (bold) | `\x1b[1;36m` | +| Line numbers | Bright blue | `\x1b[94m` | +| Source text | Default | — | +| Underline (`^^^`) | Matches severity color | — | Colors are disabled when output is piped (not a TTY) or when `--no-color` is passed. The `--json` mode never includes ANSI codes. @@ -1409,19 +1428,53 @@ note[H0101]: hole `tax_logic` requires filling help: run `sporec query-hole src/tax.sp ?tax_logic` for full HoleReport ``` +**Spec example failure:** + +```text +error[S0101]: spec example failed + --> src/dates.sp:5:5 + | + 5 | example "ISO date": parse_date("2024-01-15").is_ok() + | ^^^^^^^^^^^^^^^^^^ evaluated to false +``` + +**Spec law counterexample:** + +```text +error[S0102]: spec law counterexample + --> src/dates.sp:9:5 + | + 9 | law "round trip": |d: Date| parse_date(d.to_iso_string()) == Ok(d) + | ^^^^^^^^^^^^^^^^ counterexample found + | + = note: d = Date { year: 2024, month: 2, day: 30 } +``` + +**Missing spec warning:** + +```text +warning[W0401]: public function has no spec block + --> src/dates.sp:3:1 + | + 3 | pub fn parse_date(s: Str) -> Result[Date, ParseError] ! ParseError { + | ^^^^^^^^^^ no behavioral contract declared + | +help: add a `spec { example "...": ... 
}` block before the function body +``` + ### Recovery strategies The compiler employs error recovery to report as many errors as possible per compilation rather than stopping at the first: -| Phase | Recovery strategy | -|-------|------------------| -| Lexer | Skip to next recognizable token boundary | -| Parser | Synchronize at statement/declaration boundaries; insert synthetic nodes | -| Resolve | Mark unresolved names as `ErrorType`, continue checking | -| TypeCheck | Propagate `ErrorType` without cascading false errors | -| CapCheck | Report violation but continue checking remaining uses | -| CostCheck | Report budget exceedance but compute total cost for diagnostics | +| Phase | Recovery strategy | +| --------- | ----------------------------------------------------------------------- | +| Lexer | Skip to next recognizable token boundary | +| Parser | Synchronize at statement/declaration boundaries; insert synthetic nodes | +| Resolve | Mark unresolved names as `ErrorType`, continue checking | +| TypeCheck | Propagate `ErrorType` without cascading false errors | +| CapCheck | Report violation but continue checking remaining uses | +| CostCheck | Report budget exceedance but compute total cost for diagnostics | **Deduplication policy:** when a single root cause produces multiple downstream errors, the compiler shows the root error in full and collapses downstream @@ -1433,116 +1486,125 @@ All diagnostics carry a categorized code. 
Every code is queryable: `sporec expla #### E0xxx — Type Errors -| Code | Name | Description | -|------|------|-------------| -| `E0101` | missing-field | Struct literal missing a required field | -| `E0102` | unknown-field | Struct literal contains a field not in the type definition | -| `E0103` | duplicate-field | Struct literal contains the same field twice | -| `E0104` | field-type-mismatch | Struct field value type does not match declaration | -| `E0105` | tuple-length-mismatch | Tuple has wrong number of elements | -| `E0106` | missing-variant-field | Enum variant constructor missing a field | -| `E0107` | record-vs-tuple | Used record syntax where tuple expected, or vice versa | -| `E0108` | non-struct-field-access | Field access on a non-struct type | -| `E0109` | private-field-access | Accessing a private field from outside the defining module | -| `E0110` | spread-type-mismatch | Spread operator (`..base`) type does not match struct | -| `E0201` | arity-mismatch | Function called with wrong number of arguments | -| `E0202` | named-arg-mismatch | Named argument does not match any parameter | -| `E0203` | missing-required-arg | Required argument not provided | -| `E0204` | duplicate-arg | Same argument provided twice | -| `E0301` | type-mismatch | Expression type incompatible with expected type | -| `E0302` | return-type-mismatch | Function body returns wrong type | -| `E0303` | error-type-mismatch | Function raises undeclared error type | -| `E0304` | if-branch-mismatch | If/else branches have different types | -| `E0305` | match-arm-mismatch | Match arms have different types | -| `E0306` | operator-type-error | Operator applied to incompatible types | -| `E0307` | index-type-error | Non-integer used as index | -| `E0308` | not-callable | Attempt to call a non-function value | -| `E0309` | pipe-type-mismatch | Pipe operator left-hand side incompatible with right-hand function | -| `E0310` | lambda-return-mismatch | Lambda body type does not match expected 
return | -| `E0401` | constraint-not-satisfied | Generic type does not satisfy trait constraint | -| `E0402` | ambiguous-type | Type inference cannot determine a unique type | -| `E0403` | recursive-type | Infinitely recursive type definition | -| `E0404` | gat-mismatch | Generic associated type arguments do not match | -| `E0501` | pattern-exhaustiveness | Match does not cover all variants | -| `E0502` | pattern-type-mismatch | Pattern type does not match scrutinee type | -| `E0503` | duplicate-pattern | Same pattern appears twice in match | -| `E0504` | guard-type-error | Match guard expression is not Bool | +| Code | Name | Description | +| ------- | ------------------------ | ------------------------------------------------------------------ | +| `E0101` | missing-field | Struct literal missing a required field | +| `E0102` | unknown-field | Struct literal contains a field not in the type definition | +| `E0103` | duplicate-field | Struct literal contains the same field twice | +| `E0104` | field-type-mismatch | Struct field value type does not match declaration | +| `E0105` | tuple-length-mismatch | Tuple has wrong number of elements | +| `E0106` | missing-variant-field | Enum variant constructor missing a field | +| `E0107` | record-vs-tuple | Used record syntax where tuple expected, or vice versa | +| `E0108` | non-struct-field-access | Field access on a non-struct type | +| `E0109` | private-field-access | Accessing a private field from outside the defining module | +| `E0110` | spread-type-mismatch | Spread operator (`..base`) type does not match struct | +| `E0201` | arity-mismatch | Function called with wrong number of arguments | +| `E0202` | named-arg-mismatch | Named argument does not match any parameter | +| `E0203` | missing-required-arg | Required argument not provided | +| `E0204` | duplicate-arg | Same argument provided twice | +| `E0301` | type-mismatch | Expression type incompatible with expected type | +| `E0302` | return-type-mismatch | 
Function body returns wrong type | +| `E0303` | error-type-mismatch | Function raises undeclared error type | +| `E0304` | if-branch-mismatch | If/else branches have different types | +| `E0305` | match-arm-mismatch | Match arms have different types | +| `E0306` | operator-type-error | Operator applied to incompatible types | +| `E0307` | index-type-error | Non-integer used as index | +| `E0308` | not-callable | Attempt to call a non-function value | +| `E0309` | pipe-type-mismatch | Pipe operator left-hand side incompatible with right-hand function | +| `E0310` | lambda-return-mismatch | Lambda body type does not match expected return | +| `E0401` | constraint-not-satisfied | Generic type does not satisfy trait constraint | +| `E0402` | ambiguous-type | Type inference cannot determine a unique type | +| `E0403` | recursive-type | Infinitely recursive type definition | +| `E0404` | gat-mismatch | Generic associated type arguments do not match | +| `E0501` | pattern-exhaustiveness | Match does not cover all variants | +| `E0502` | pattern-type-mismatch | Pattern type does not match scrutinee type | +| `E0503` | duplicate-pattern | Same pattern appears twice in match | +| `E0504` | guard-type-error | Match guard expression is not Bool | #### W0xxx — Warnings -| Code | Name | Description | -|------|------|-------------| -| `W0101` | unused-variable | Variable bound but never read | -| `W0102` | unused-import | Module import never referenced | -| `W0103` | unused-function | Private function never called | -| `W0104` | unused-type | Private type never referenced | -| `W0105` | unused-effect | Declared effect never exercised | -| `W0201` | redundant-pattern | Match arm unreachable due to prior arm | -| `W0202` | redundant-constraint | Generic constraint implied by another | -| `W0203` | redundant-parentheses | Unnecessary parentheses around expression | -| `W0301` | shadowing | Variable shadows binding in outer scope | -| `W0302` | implicit-discard | Expression result 
discarded without explicit `_` | +| Code | Name | Description | +| ------- | --------------------- | ------------------------------------------------ | +| `W0101` | unused-variable | Variable bound but never read | +| `W0102` | unused-import | Module import never referenced | +| `W0103` | unused-function | Private function never called | +| `W0104` | unused-type | Private type never referenced | +| `W0105` | unused-effect | Declared effect never exercised | +| `W0201` | redundant-pattern | Match arm unreachable due to prior arm | +| `W0202` | redundant-constraint | Generic constraint implied by another | +| `W0203` | redundant-parentheses | Unnecessary parentheses around expression | +| `W0301` | shadowing | Variable shadows binding in outer scope | +| `W0302` | implicit-discard | Expression result discarded without explicit `_` | +| `W0401` | missing-spec | Public function has no `spec` block | #### C0xxx — Effect Violations -| Code | Name | Description | -|------|------|-------------| -| `C0101` | undeclared-effect | Function uses effect not in `uses` | -| `C0102` | exceeds-ceiling | Function `uses` exceeds module ceiling | -| `C0103` | callee-effect-leak | Calling function whose `uses` exceeds caller's | -| `C0104` | transitive-effect | Transitive callee introduces undeclared effect | +| Code | Name | Description | +| ------- | ---------------------- | ----------------------------------------------- | +| `C0101` | undeclared-effect | Function uses effect not in `uses` | +| `C0102` | exceeds-ceiling | Function `uses` exceeds module ceiling | +| `C0103` | callee-effect-leak | Calling function whose `uses` exceeds caller's | +| `C0104` | transitive-effect | Transitive callee introduces undeclared effect | | `C0201` | platform-effect-denied | Package requests effect Platform does not grant | -| `C0202` | platform-missing | No Platform provides required effect | -| `C0301` | effect-purity-conflict | `uses []` (pure) function calls impure code | +| `C0202` | 
platform-missing | No Platform provides required effect | +| `C0301` | effect-purity-conflict | `uses []` (pure) function calls impure code | #### K0xxx — Cost Violations -| Code | Name (implementation) | Short description | -|------|----------------------|-------------------| -| `K0101` | cost budget exceeded | Inferred cost exceeds declared bound (warning in current policy) | -| `K0102` | cost annotation mismatch | Declared `cost [...]` does not match inferred cost | -| `K0001` | cost budget exceeded | Legacy alias for `K0101` | -| `K0201` | unbounded recursion detected | Recursion pattern not structurally bounded | -| `K0202` | loop without bounded iteration | Reserved / iteration-path diagnostic | -| `K0301` | missing cost on recursive function | Recursive function lacks `cost [...]` where required | -| `K0302` | invalid cost expression | `cost [...]` parse or shape error | -| `K0303` | `@unbounded` requires cost | `@unbounded` without `cost [c, a, i, p]` (hard error) | +| Code | Name (implementation) | Short description | +| ------- | ---------------------------------- | ---------------------------------------------------------------- | +| `K0101` | cost budget exceeded | Inferred cost exceeds declared bound (warning in current policy) | +| `K0102` | cost annotation mismatch | Declared `cost [...]` does not match inferred cost | +| `K0001` | cost budget exceeded | Legacy alias for `K0101` | +| `K0201` | unbounded recursion detected | Recursion pattern not structurally bounded | +| `K0202` | loop without bounded iteration | Reserved / iteration-path diagnostic | +| `K0301` | missing cost on recursive function | Recursive function lacks `cost [...]` where required | +| `K0302` | invalid cost expression | `cost [...]` parse or shape error | +| `K0303` | `@unbounded` requires cost | `@unbounded` without `cost [c, a, i, p]` (hard error) | -Canonical enum and messages: `spore` repo `crates/sporec-typeck/src/error.rs`. 
+Canonical error codes and messages are defined in the error code registry. #### H0xxx — Hole Diagnostics -| Code | Name (implementation) | Short description | -|------|----------------------|-------------------| -| `H0101` | typed hole found | Standard hole report entry | -| `H0102` | hole with inferred type | Hole with type context | -| `H0103` | hole in return position | Hole where return is expected | -| `H0201` | hole candidates available | Ranked fills exist | -| `H0202` | no candidates found | No suitable fill | -| `H0203` | ambiguous candidates | Multiple close matches | -| `H0301` | hole depends on another hole | Ordering / dependency | -| `H0302` | circular hole dependency | Dependency cycle | +| Code | Name (implementation) | Short description | +| ------- | ---------------------------- | ----------------------------- | +| `H0101` | typed hole found | Standard hole report entry | +| `H0102` | hole with inferred type | Hole with type context | +| `H0103` | hole in return position | Hole where return is expected | +| `H0201` | hole candidates available | Ranked fills exist | +| `H0202` | no candidates found | No suitable fill | +| `H0203` | ambiguous candidates | Multiple close matches | +| `H0301` | hole depends on another hole | Ordering / dependency | +| `H0302` | circular hole dependency | Dependency cycle | + +See §H0xxx — Hole Diagnostics in the error code registry. + +#### S0xxx — Spec Diagnostics -Canonical enum: `spore` repo `crates/sporec-typeck/src/error.rs` (SEP tables may lag other `H` variants). 
+| Code | Name | Description | +| ------- | ----------------------- | --------------------------------------- | +| `S0101` | spec-example-failed | `example` item evaluated to false | +| `S0102` | spec-law-counterexample | `law` item produced a counterexample | +| `S0201` | spec-body-hit-hole | Spec execution reached an unfilled hole | #### M0xxx — Module Errors -| Code | Name | Description | -|------|------|-------------| -| `M0101` | circular-dependency | Modules form a dependency cycle | -| `M0102` | self-import | Module imports itself | -| `M0103` | duplicate-module | Two files map to the same module path | -| `M0201` | visibility-violation | Accessing private or `pub(pkg)` symbol from outside scope | -| `M0202` | re-export-visibility | Re-exporting with broader visibility than original | -| `M0203` | orphan-impl | Implementing external trait for external type | -| `M0204` | alias-chain | `pub alias` points to another alias instead of a concrete export | -| `M0301` | import-not-found | Imported module or symbol does not exist | -| `M0302` | ambiguous-import | Two imports bring same name into scope | -| `M0303` | wildcard-import | Wildcard imports are not allowed in Spore | -| `M0304` | import-shadowing-conflict | Import alias conflicts with module name or existing import binding | -| `M0401` | snapshot-changed | Dependent signature hash changed; requires `--permit` | -| `M0402` | snapshot-missing | Referenced snapshot not found in `.spore-lock` | -| `M0501` | platform-binding-conflict | Project declares more than one Platform binding | +| Code | Name | Description | +| ------- | ------------------------- | ---------------------------------------------------------------------- | +| `M0101` | circular-dependency | Modules form a dependency cycle | +| `M0102` | self-import | Module imports itself | +| `M0103` | duplicate-module | Two files map to the same module path | +| `M0201` | visibility-violation | Accessing private or `pub(pkg)` symbol from outside scope | 
+| `M0202` | re-export-visibility | Re-exporting with broader visibility than original | +| `M0203` | orphan-impl | Implementing external trait for external type | +| `M0204` | alias-chain | `pub alias` points to another alias instead of a concrete export | +| `M0301` | import-not-found | Imported module or symbol does not exist | +| `M0302` | ambiguous-import | Two imports bring same name into scope | +| `M0303` | wildcard-import | Wildcard imports are not allowed in Spore | +| `M0304` | import-shadowing-conflict | Import alias conflicts with module name or existing import binding | +| `M0401` | snapshot-changed | Dependent signature hash changed; requires `--permit` | +| `M0402` | snapshot-missing | Referenced snapshot not found in `.spore-lock` | +| `M0501` | platform-binding-conflict | Project declares more than one Platform binding | | `M0502` | startup-contract-mismatch | Startup function signature does not satisfy selected Platform contract | ### System integration @@ -1744,26 +1806,25 @@ The prefix convention (E/W/C/K/H/M) maps directly to Spore's major subsystems. 
## Prior art -| Language | Pipeline | What Spore borrows | -|----------|----------|-------------------| -| **Rust** | AST → HIR → THIR → MIR → LLVM IR | HIR/TypedHIR concepts; salsa for incremental compilation (via rust-analyzer); Rust-style diagnostic layout | -| **Zig** | AST → ZIR → AIR → Machine IR | Pratt parser for expressions; *not* the ZIR concept (no comptime) | -| **Roc** | AST → Canonical → Solved → IR | Canonical ≈ HIR concept; name resolution approach | -| **Gleam** | Untyped AST → Typed AST | Inspiration for simplicity in the 2-layer approach; confirmed that 2 layers are too few for Spore | -| **Gonidium** | AST → TypedDag | `DiagCollector` error collection pattern | -| **Elm** | AST → Canonical → Typed → Optimized | Philosophy of helpful error messages; Spore adopts the helpfulness but uses Rust's concise layout rather than Elm's paragraph style | +| Language | Pipeline | What Spore borrows | +| ------------ | ----------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------- | +| **Rust** | AST → HIR → THIR → MIR → LLVM IR | HIR/TypedHIR concepts; salsa for incremental compilation (via rust-analyzer); Rust-style diagnostic layout | +| **Zig** | AST → ZIR → AIR → Machine IR | Pratt parser for expressions; _not_ the ZIR concept (no comptime) | +| **Roc** | AST → Canonical → Solved → IR | Canonical ≈ HIR concept; name resolution approach | +| **Gleam** | Untyped AST → Typed AST | Inspiration for simplicity in the 2-layer approach; confirmed that 2 layers are too few for Spore | +| **Gonidium** | AST → TypedDag | `DiagCollector` error collection pattern | +| **Elm** | AST → Canonical → Typed → Optimized | Philosophy of helpful error messages; Spore adopts the helpfulness but uses Rust's concise layout rather than Elm's paragraph style | -### salsa (Rust ecosystem) +### Incremental computation framework (Prior Art) -salsa is a framework for 
demand-driven incremental computation, originally -developed for rust-analyzer. It provides: +Demand-driven incremental computation frameworks provide: - Automatic memoization of query results - Fine-grained dependency tracking - Automatic invalidation and recomputation on input changes - Durability layers for optimizing frequently vs. rarely changing inputs -Spore uses salsa to make each compilation pass a tracked query, enabling +The compiler uses demand-driven incremental computation to make each compilation pass a tracked query, enabling automatic incremental recomputation when source files change. ### Cranelift @@ -1782,15 +1843,15 @@ debug builds. Key properties: ## Backward compatibility and migration -### PoC → Prototype migration +### Migration from interpreter to native codegen -The PoC phase uses a **tree-walking interpreter** in place of Cranelift codegen. +An initial implementation may use a **tree-walking interpreter** in place of native codegen. The migration path: 1. The `codegen` pass is behind a clean abstraction boundary (it receives `TypedHIR` and produces an executable result) -2. Replacing the tree-walking interpreter with Cranelift codegen requires - implementing the `TypedHIR → Cranelift IR` translation — no changes to +2. Replacing the tree-walking interpreter with native codegen requires + implementing the `TypedHIR → native` translation — no changes to passes 1–4 3. The `--json` diagnostic format is identical in both phases; tools built against PoC diagnostics continue to work in prototype @@ -1801,9 +1862,8 @@ Error codes (E0xxx, W0xxx, etc.) are considered stable once assigned. The meaning of a code does not change, though the message text may be improved. CI/CD pipelines and agents should filter on error codes, not message strings. -If the JSON schema needs breaking changes, the `"version"` field will be -incremented and a migration period provided where both old and new schemas are -emitted. 
+If the JSON schema needs breaking changes, the schema identifier will change and +a migration period will provide both old and new schema shapes. ### Future IR additions @@ -1856,7 +1916,7 @@ TypedHIR layer; a new "opt hash" could gate the MIR → Cranelift translation. 7. **Internationalization of diagnostics.** Error codes are language-independent, but message text is currently English-only. Should `sporec explain` support - localized explanations in a future version? This would require a translation + localized explanations in a future release? This would require a translation infrastructure. 8. **Cost visualization in verbose mode.** Should `--verbose` include ASCII bar @@ -1895,13 +1955,13 @@ Suppression is limited to **warnings only** — errors and notes cannot be suppr ## Glossary -| Term | Definition | -|------|-----------| -| **impl hash** | Hash of module implementation content; determines whether to recompile this module | -| **sig hash** | Hash of module public interface; determines whether downstream dependents need rechecking | -| **hole** | Source-code placeholder (`?name`) for unfinished implementation, carrying type information | -| **effect** | A declared system permission (e.g., `Network`, `FileSystem`) required to perform certain operations | -| **effect ceiling** | Project-level maximum effect set; no module may exceed it | -| **cost annotation** | A declared four-slot upper bound (`cost [compute, alloc, io, parallel]`) | -| **debounce** | Coalescing multiple rapid file-system events into a single compilation trigger | -| **NDJSON** | Newline-Delimited JSON — each line is a complete, independent JSON object | +| Term | Definition | +| ------------------- | --------------------------------------------------------------------------------------------------------- | +| **impl hash** | Hash of module implementation content; determines whether to recompile this module | +| **sig hash** | Hash of module public interface; determines whether downstream 
dependents need rechecking | +| **hole** | Source-code placeholder (`?name`) for unfinished implementation, carrying type information | +| **effect** | A declared capability such as `NetConnect`, `FileRead`, or `Spawn` required to perform certain operations | +| **effect ceiling** | Reserved module/project policy concept; function-level `uses [...]` checking is the stable surface | +| **cost annotation** | A declared four-slot upper bound (`cost [compute, alloc, io, parallel]`) | +| **debounce** | Coalescing multiple rapid file-system events into a single compilation trigger | +| **NDJSON** | Newline-Delimited JSON — each line is a complete, independent JSON object | diff --git a/seps/SEP-0007-concurrency-model.md b/seps/SEP-0007-concurrency-model.md index d7fb451..5866101 100644 --- a/seps/SEP-0007-concurrency-model.md +++ b/seps/SEP-0007-concurrency-model.md @@ -8,6 +8,7 @@ authors: created: 2026-03-31 requires: - 1 + - 2 - 3 - 4 discussion: "https://github.com/spore-lang/spore-evolution/discussions/7" @@ -333,7 +334,7 @@ uses [Channel, Spawn] Server entrypoints that accept inbound connections declare `NetListen`; the inner request handler below only needs outbound `NetConnect` plus domain-specific effects. ```spore -effect HttpHandler = Spawn | NetConnect | DbRead | Clock +effect HttpHandler = Spawn | NetConnect | DbRead | Clock; fn handle_request(req: Request) -> Response ! DbError | Timeout cost [5000, 800, 200, 3] @@ -1039,7 +1040,7 @@ $ sporec --query-task-tree handle_request ### Handler binding as protocol -At the application boundary (the `main` function or service entry point), handler binding forms a protocol specification: *"this application uses real parallelism with a pool of N threads, real network IO, and PostgreSQL for database"*. This binding can be serialized and versioned as part of the deployment configuration. 
+At the application boundary (the `main` function or service entry point), handler binding forms a protocol specification: *"this application uses real parallelism with a pool of N threads, real network IO, and PostgreSQL for database"*. This binding can be serialized as part of the deployment configuration. ### Integration with Spore subsystems @@ -1135,7 +1136,7 @@ This timeline can be rendered as a Gantt chart, flame graph, or task dependency **Rejected.** Lock contention cost is unpredictable (spin/retry count depends on runtime scheduling). Deadlock detection in the presence of shared memory and locks is NP-hard. The effect handler + continuation model does not compose cleanly with lock semantics. -**Consequences of rejection:** Performance-critical lock-free patterns (e.g., atomic counters) are unavailable in safe Spore. A future `unsafe_shared` extension point is reserved but not in v0.1. +**Consequences of rejection:** Performance-critical lock-free patterns (e.g., atomic counters) are unavailable in safe Spore. A future `unsafe_shared` extension point is reserved but not in the accepted surface. ### Alternative 6: Unstructured spawning with optional scoping @@ -1335,7 +1336,7 @@ uses [Spawn, Channel, ...] ### A.9 Complete example: HTTP handler ```spore -effect HttpHandler = Spawn | NetConnect | DbRead | Clock +effect HttpHandler = Spawn | NetConnect | DbRead | Clock; fn handle_request(req: Request) -> Response ! 
DbError | Timeout cost [5000, 800, 200, 3] diff --git a/seps/SEP-0008-module-package-system.md b/seps/SEP-0008-module-package-system.md index e11ab2f..1e2e70d 100644 --- a/seps/SEP-0008-module-package-system.md +++ b/seps/SEP-0008-module-package-system.md @@ -8,7 +8,9 @@ authors: created: 2026-03-31 requires: - 1 + - 2 - 3 + - 4 discussion: "https://github.com/spore-lang/spore-evolution/discussions/8" pr: null superseded_by: null @@ -16,11 +18,11 @@ superseded_by: null # SEP-0008: Module & Package System -> **Executive Summary**: Defines a content-addressed package system (BLAKE3 dual-hash) with platform-as-package architecture, where a selected Platform package declares its startup contract, host adapter, and handled effects via manifest metadata and package modules. Uses 1-file-=1-module mapping with dot-separated import paths (`import billing.invoice`) and `spore.toml` for project configuration. Enforces visibility rules (pub/pub(pkg)/private) at module boundaries, supports multi-file compilation with explicit import resolution, and achieves supply-chain security through explicit function-level effects plus Platform metadata. Broader project/file effect ceilings remain follow-up work rather than part of the current shipped contract. +> **Executive Summary**: Defines a content-addressed package system (BLAKE3 dual-hash) with platform-as-package architecture, where a selected Platform package declares its startup contract, host adapter, and handled effects via manifest metadata and package modules. Uses 1-file-=1-module mapping with dot-separated import paths (`import billing.invoice`) and `spore.toml` for project configuration. Enforces visibility rules (pub/pub(pkg)/private) at module boundaries, supports multi-file compilation with explicit import resolution, and achieves supply-chain security through explicit function-level effects plus Platform metadata. Broader project/file effect ceilings remain follow-up work. 
## Summary -This SEP specifies the complete module and package system for the Spore programming language. It defines how code is organized (one file = one module, no explicit `module` declarations), how modules are imported via dot-separated paths (`import billing.invoice`), how visibility is controlled (private / `pub(pkg)` / `pub`), how functions are content-addressed via a dual-hash scheme (signature hash + implementation AST hash using BLAKE3), how packages are structured around `spore.toml` manifests with a `.spore-lock` for reproducible builds, how dependencies are resolved without semantic versioning, and how IO is abstracted through selected Platform packages and startup contracts. Additional project-wide or file-level effect ceilings remain reserved for later SEP work instead of being part of the current implementation-aligned contract. +This SEP specifies the complete module and package system for the Spore programming language. It defines how code is organized (one file = one module, no explicit `module` declarations), how modules are imported via dot-separated paths (`import billing.invoice`), how visibility is controlled (private / `pub(pkg)` / `pub`), how functions are content-addressed via a dual-hash scheme (signature hash + implementation AST hash using BLAKE3), how packages are structured around `spore.toml` manifests with a `.spore-lock` for reproducible builds, how dependencies are resolved without semantic versioning, and how IO is abstracted through selected Platform packages and startup contracts. Additional project-wide or file-level effect ceilings remain reserved for later SEP work. 
The design synthesizes ideas from Unison (content-addressing), Roc (platform/package separation), Elixir/Python (dot-separated import paths), Elm/Go (file-based modules, no circular dependencies), and Rust (three-level visibility), while introducing novel concepts such as the import/alias separation, reserved effect-ceiling design space, and the dual-hash content-addressing scheme that decouples API stability from implementation pinning. @@ -57,7 +59,7 @@ Content hashes make semver redundant. Instead of trusting a human to correctly l - `sig` changed → Breaking change. The `spore --permit` command is required to explicitly accept the change. - `impl` changed, `sig` unchanged → Internal refactor. `.spore-lock` updates automatically. -A human-readable `version` field remains available in `spore.toml` for documentation purposes, but it plays no role in dependency resolution. Named aliases (e.g., `"std-http-v2"`) provide stable human-readable identifiers that map to specific hashes. +A human-readable release label may remain available in `spore.toml` for documentation purposes, but it plays no role in dependency resolution. Named aliases (e.g., `"std-http-stable"`) provide stable human-readable identifiers that map to specific hashes. ### Why platforms for IO? @@ -77,14 +79,14 @@ Every `.sp` file is exactly one module. The module name is derived from the file The canonical source extension in docs and manifests is `.sp`. Compatibility-only `.spore` handling, where it exists, is secondary to this spelling. -| File Path | Module Name | -|---|---| +| File Path | Module Name | +| ------------------------ | ----------------- | | `src/billing/invoice.sp` | `billing.invoice` | -| `src/auth/token.sp` | `auth.token` | -| `src/utils.sp` | `utils` | -| `src/main.sp` | `main` | +| `src/auth/token.sp` | `auth.token` | +| `src/utils.sp` | `utils` | +| `src/main.sp` | `main` | -There is no `module` keyword — the filesystem **is** the declaration. 
Current implementations do not require or standardize an additional file-level `#![uses(...)]` ceiling; effect requirements remain attached to individual function signatures. +There is no `module` keyword — the filesystem **is** the declaration. Spore requires effect requirements attached to individual function signatures; no additional file-level `#![uses(...)]` ceiling is standardized. ```spore // src/billing/invoice.sp @@ -114,8 +116,8 @@ Module names use **dot-separated** paths derived from the filesystem. There is n Spore uses **dot-separated** paths for module imports and for item selection: ```spore -// Module import: brings the module into scope; current implementations then -// register its pub items as unqualified local names +// Module import: brings the module into scope; +// registers its pub items as unqualified local names import billing.invoice // Module import with rename @@ -126,13 +128,7 @@ alias gen = billing.invoice.generate_invoice alias Inv = billing.types.Invoice ``` -> **Note (D12):** Selective imports (`import billing.invoice.{calculate, Invoice}`) are not supported in v0.1. Use `import billing.invoice` and call the imported item by its current bare local name, or use `alias` for specific items. -> -> **Current implementation note:** `import mod as alias` is the locked surface -> syntax, but imported items are still registered unqualified in the current -> checker/runtime path. Alias-qualified member access (`inv.generate_invoice()`) -> remains follow-up work; today the safe implementation-aligned path is to -> import the module and call the imported item by its bare local name. +> **Note:** Selective imports (`import billing.invoice.{calculate, Invoice}`) are not part of the accepted surface. Use `import billing.invoice` and call the imported item by its bare local name, or use `alias` for specific items. 
Key rules: @@ -151,11 +147,11 @@ Key rules: ### Visibility -| Level | Keyword | Meaning | -|---|---|---| -| **Private** | *(default)* | Visible only within the defining module | -| **Package-internal** | `pub(pkg)` | Visible to any module within the same package | -| **Public** | `pub` | Visible to any importer, including external packages | +| Level | Keyword | Meaning | +| -------------------- | ----------- | ---------------------------------------------------- | +| **Private** | _(default)_ | Visible only within the defining module | +| **Package-internal** | `pub(pkg)` | Visible to any module within the same package | +| **Public** | `pub` | Visible to any importer, including external packages | ```spore // Public: any importer can call this @@ -190,11 +186,11 @@ my-billing-lib/ Three package types exist: -| Type | Has Platform? | Can do IO? | Use Case | -|---|---|---|---| -| `package` | No | No (declares effect requirements) | Libraries, reusable components | -| `application` | Yes | Yes (via Platform) | Executables, services | -| `platform` | Is the Platform | Yes (raw syscalls) | Runtime providers | +| Type | Has Platform? | Can do IO? | Use Case | +| ------------- | --------------- | --------------------------------- | ------------------------------ | +| `package` | No | No (declares effect requirements) | Libraries, reusable components | +| `application` | Yes | Yes (via Platform) | Executables, services | +| `platform` | Is the Platform | Yes (raw syscalls) | Runtime providers | ### Script Mode @@ -205,7 +201,7 @@ manifest-backed packages and applications only. ### Platforms -A Platform is the **only** package in a Spore application that bridges declared effects to the host runtime. In the current MVP it contributes: +A Platform is the **only** package in a Spore application that bridges declared effects to the host runtime. It contributes: 1. package modules that expose Platform-owned `foreign fn` surfaces, 2. 
a contract module that defines the startup signature, and @@ -223,7 +219,7 @@ fn main() -> () uses [Console, Exit] { For `basic-cli`, project mode currently requires `main() -> ()`. Explicit process termination is modeled as a Platform API (`basic_cli.cmd.exit`) requiring `uses [Exit]`; the runtime carries that as a structured project outcome and the CLI converts that outcome to a host exit status at the process boundary. -In the current implementation, project mode also validates the selected entry's +Project mode validates the selected entry's startup effect requirements against the selected Platform package's `[platform].handled-effects` manifest field. A manifest-backed application may only start if its entry function's required effects are listed in that Platform @@ -287,7 +283,7 @@ Illustrative topological build order: 6. main (1 dependency: api.handler) ``` -Diamond dependencies are permitted (they are not cycles). The `.spore-lock` ensures a single canonical version via matching `sig` and `impl` hashes. +Diamond dependencies are permitted (they are not cycles). The `.spore-lock` ensures a single canonical implementation via matching `sig` and `impl` hashes. #### Forward references @@ -297,8 +293,8 @@ Within a single module, functions may reference each other regardless of declara ```text import_decl ::= 'import' module_path ('as' IDENT)? - // NOTE (D12): selective imports ('import' module_path '.' '{' IDENT (',' IDENT)* '}') - // are not supported in v0.1 + // NOTE: selective imports ('import' module_path '.' '{' IDENT (',' IDENT)* '}') + // are not part of the accepted surface alias_decl ::= visibility? 'alias' IDENT '=' qualified_item module_path ::= IDENT ('.' IDENT)* qualified_item ::= module_path '.' IDENT @@ -362,20 +358,20 @@ For partial functions (those containing holes), `impl = None`. When a hole is fi #### What changes each hash -| Change | `sig` changes? | `impl` changes? 
| -|---|---|---| -| Parameter name: `order` → `purchase_order` | **Yes** | **Yes** | -| Parameter type: `Order` → `OrderRequest` | **Yes** | **Yes** | -| Return type: `Invoice` → `InvoiceResult` | **Yes** | **Yes** | -| Error type set: Add `[ValidationError]` | **Yes** | **Yes** | -| Effects: `pure` → `deterministic` | **Yes** | **Yes** | -| Cost bound: `≤ 3000` → `≤ 5000` | **Yes** | **Yes** | -| Effects: Add `AuditLog` | **Yes** | **Yes** | -| Generic constraints: `T: Eq` → `T: Eq + Hash` | **Yes** | **Yes** | -| Function body: Refactor internals | **No** | **Yes** | -| Hole filling: Replace `?logic` with code | **No** | `None` → concrete | -| Comments: Add/remove/edit | **No** | **No** | -| Formatting: Reformat code | **No** | **No** | +| Change | `sig` changes? | `impl` changes? | +| --------------------------------------------- | -------------- | ----------------- | +| Parameter name: `order` → `purchase_order` | **Yes** | **Yes** | +| Parameter type: `Order` → `OrderRequest` | **Yes** | **Yes** | +| Return type: `Invoice` → `InvoiceResult` | **Yes** | **Yes** | +| Error type set: Add `[ValidationError]` | **Yes** | **Yes** | +| Effects: `pure` → `deterministic` | **Yes** | **Yes** | +| Cost bound: `≤ 3000` → `≤ 5000` | **Yes** | **Yes** | +| Effects: Add `AuditLog` | **Yes** | **Yes** | +| Generic constraints: `T: Eq` → `T: Eq + Hash` | **Yes** | **Yes** | +| Function body: Refactor internals | **No** | **Yes** | +| Hole filling: Replace `?logic` with code | **No** | `None` → concrete | +| Comments: Add/remove/edit | **No** | **No** | +| Formatting: Reformat code | **No** | **No** | Key insight: signature changes always change both hashes. Body-only changes update `impl` while leaving `sig` untouched—this prevents unnecessary downstream recompilation. @@ -431,12 +427,12 @@ The `spore --permit` command updates `sig` entries in `.spore-lock`. 
Implementat Hashes cascade through three levels, all derived from `sig` only (implementation changes do not propagate): -| Level | What is hashed | When it changes | -|---|---|---| -| **Function `sig`** | Canonical signature | Any signature component changes | -| **Function `impl`** | Compiled AST | Body changes, hole filling, or signature changes | -| **Module interface** | Sorted hash of all `pub` + `pub(pkg)` `sig` hashes | Any exported function's signature changes | -| **Package API** | Sorted hash of module interface hashes | Any module's exported interface changes | +| Level | What is hashed | When it changes | +| -------------------- | -------------------------------------------------- | ------------------------------------------------ | +| **Function `sig`** | Canonical signature | Any signature component changes | +| **Function `impl`** | Compiled AST | Body changes, hole filling, or signature changes | +| **Module interface** | Sorted hash of all `pub` + `pub(pkg)` `sig` hashes | Any exported function's signature changes | +| **Package API** | Sorted hash of module interface hashes | Any module's exported interface changes | ### Visibility rules @@ -542,7 +538,7 @@ alias invoice = billing.types.Invoice // ERROR: 'invoice' conflicts with modu #### IO via Platform boundary Application code declares effect requirements and crosses the runtime -boundary through the selected Platform package. In the current MVP, +boundary through the selected Platform package; manifest-backed project mode resolves those calls through Platform-owned modules plus the Platform's startup contract: @@ -570,7 +566,7 @@ A Platform declares three things: 3. **Host adapter surface**: The package-owned adapter that bridges the application startup into the Platform runtime.
-For the current MVP bridge, the source of truth is split between the Platform +The source of truth is split between the Platform package manifest and a dedicated contract module inside the package: ```toml @@ -601,7 +597,7 @@ pub fn main_for_host(app_main: () -> ()) -> () { The hole-backed startup function in the Platform contract module is the authoritative startup signature. Applications targeting that Platform must implement the same startup function name plus the same parameter/return shape in -their entry module. Current MVP effect requirements are checked separately: +their entry module. Effect requirements are checked separately: the selected startup entry's `uses [...]` set must fit within `[platform].handled-effects` rather than being encoded into the adapter's callable Rust-facing shape. Any `spec` items attached to the Platform contract and the application @@ -624,7 +620,7 @@ boundary accepts exit codes in `0..=255`; unsupported values are reported as errors rather than silently truncated. Parser-level `platform { ... }` blocks remain future sugar over the same -package-owned contract; the MVP does not require that syntax to land first. +package-owned contract; this SEP does not require that syntax to land first. Application-side Platform selection in `spore.toml`: @@ -659,7 +655,7 @@ multiple Platforms. Since application code is decoupled from IO implementations, a future Test Platform can substitute mock handlers. The syntax below is illustrative future -sugar rather than the current package-backed MVP surface: +sugar rather than the package-backed surface: ```spore platform TestPlatform { @@ -688,7 +684,7 @@ Running `spore test tests/fetch_weather_test.sp` automatically uses the Test Pla ```bash $ spore test tests/fetch_weather_test.sp - Using test platform: spore-platform/test v1.0.0 + Using test platform: spore-platform/test Running 3 tests test fetch_weather_returns_valid_data ... 
ok (0.001s) @@ -725,7 +721,7 @@ Package: billing-lib #### Effect checking today -The current implementation-aligned model standardizes effect checking at two +The normative model standardizes effect checking at two places: 1. function signatures (`uses [...]`) @@ -733,7 +729,7 @@ places: Additional project-wide or file-level ceilings such as `[effects].declared` or `#![uses(...)]` remain reserved for follow-up design work. They are not part -of the shipped MVP contract today, and tooling should not treat them as +of this SEP's contract, and tooling should not treat them as normative. #### Effect propagation @@ -747,12 +743,11 @@ Importing a module does **not** grant its effects. If you call a function that r ```toml [package] name = "billing-lib" -version = "1.2.0" # human-readable, not used for resolution +release = "stable" # human-readable, not used for resolution type = "package" # "package" | "application" | "platform" description = "Invoice generation and tax calculation" license = "MIT" authors = ["Alice "] -spore-version = ">=0.5.0" [dependencies] json-parser = { @@ -764,7 +759,7 @@ json-parser = { [dev-dependencies] test-framework = { git = "https://github.com/spore-std/test", - alias = "spore-test-v1", + alias = "spore-test-stable", sig = "a1b2c3d4e5f6a7b8", impl = "1a2b3c4d5e6f7a8b" } @@ -772,7 +767,7 @@ test-framework = { Dependencies are declared **per-project** in `spore.toml`, not per-file. Additional project-wide effect ceilings remain future design work rather than -part of the current normative MVP surface. +part of the normative surface. Package names are `kebab-case` and globally unique within a registry. Module paths within a package use `snake_case` segments. @@ -780,12 +775,12 @@ Package names are `kebab-case` and globally unique within a registry. Module pat An application declares one or more named **entries** in `spore.toml`.
Each entry selects an **entry module**, and the selected Platform then validates the -module's **startup function** against its **startup contract**. Under the MVP +module's **startup function** against its **startup contract**. Under the package-backed bridge, the selected Platform package manifest names the contract module and startup symbol, and that contract module owns the hole-backed startup definition whose `spec` items stack with the application's own implementation. -Current CLI behavior also retains a compatibility path for explicit +The CLI supports explicit `spore run src/...` file execution: when the file lives under a discovered project's `src/` tree, the tool derives a project-backed entry path from that file location. Named manifest entries remain the canonical durable project @@ -814,10 +809,10 @@ path = "src/migrate.sp" ``` Each entry path must point to a `.sp` file containing a startup function that -satisfies the selected Platform's startup contract. Under the MVP bridge, the +satisfies the selected Platform's startup contract. The selected Platform package manifest names the contract module and startup symbol, while that contract module owns the hole-backed startup definition and adapter. -`basic-cli` currently uses `main` plus `main_for_host`. +`basic-cli` uses `main` plus `main_for_host`. #### Content-addressed dependencies (BLAKE3, no semver) @@ -861,7 +856,7 @@ Content hashes ensure integrity regardless of source. Community registries provi #### Effect-based trust model Packages declare dependencies and Platform selection in `spore.toml`. Current -implementation-aligned effect control does **not** add a separate +effect control does **not** add a separate project-wide `[effects].declared` ceiling.
Instead: - tooling such as `spore audit` reports the resulting dependency/effect graph More granular dependency grants remain future design space rather than part of -the current shipped MVP. +the this SEP contract. #### Workspace support @@ -897,12 +892,12 @@ A workspace uses a **single `.spore-lock`** at the workspace root. Individual me Spore recognizes four dependency types: -| Type | `sig` required? | `impl` required? | When needed | -|---|---|---|---| -| **Interface** (sig-only) | Yes | No | Type-checking without fetching full source | -| **Complete** (sig+impl) | Yes | Yes | Full build, linking, reproducible deployment | -| **Dev** | Yes | Yes | Testing, benchmarking; excluded from production builds | -| **Optional** | Yes | Yes (if enabled) | Controlled by feature flags | +| Type | `sig` required? | `impl` required? | When needed | +| ------------------------ | --------------- | ---------------- | ------------------------------------------------------ | +| **Interface** (sig-only) | Yes | No | Type-checking without fetching full source | +| **Complete** (sig+impl) | Yes | Yes | Full build, linking, reproducible deployment | +| **Dev** | Yes | Yes | Testing, benchmarking; excluded from production builds | +| **Optional** | Yes | Yes (if enabled) | Controlled by feature flags | ```toml [dependencies] @@ -925,7 +920,7 @@ metrics = { sig = "e2f3a4b5c6d7e8f9", impl = "9f0a1b2c3d4e5f6a", optional = true mylib = { git = "https://github.com/user/repo", branch = "main", # optional: track a branch - tag = "v1.2.3", # optional: pin a tag + tag = "stable-release", # optional: pin a tag rev = "abc123def456", # optional: pin a commit sig = "a1b2c3d4e5f6a7b8", impl = "e5f6a7b8c9d0e1f2" @@ -968,11 +963,10 @@ The `.spore-lock` file records the fully resolved dependency graph: ```toml # Auto-generated by `spore lock`. Do not edit. 
-version = 1 +schema = "current" [metadata] generated-at = "2024-01-15T10:30:00Z" -spore-version = "0.5.2" [[package]] name = "web-server" @@ -982,7 +976,7 @@ impl = "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2" [[package]] name = "http" -alias = "std-http-v2" +alias = "std-http-stable" source = { type = "git", url = "https://github.com/std/http", rev = "abc1234" } sig = "a3f9c2e1d8b7a6f5e4d3c2b1a0f9e8d7c6b5a4f3e2d1c0b9a8f7e6d5c4b3a2f1" impl = "7b8d4f2a1e9c3d5b6a7f8e9d0c1b2a3f4e5d6c7b8a9f0e1d2c3b4a5f6e7d8c9" @@ -1022,7 +1016,7 @@ platform = "wasm32" packages = ["json"] [build-metadata] -compiler-version = "spore-0.5.2" +compiler-id = "spore" build-timestamp = "2024-01-15T10:30:10Z" ``` @@ -1076,7 +1070,7 @@ pub trait StorageBackend { **Cache strategy**: -- **Content-addressed**: Identical hashes share storage. Two packages using the same dependency version store it once. +- **Content-addressed**: Identical hashes share storage. Two packages using the same dependency content store it once. - **Global shared**: By default `~/.spore-store` is shared across all projects. Configurable via `~/.sp/config.toml`. - **Offline-capable**: Once cached, dependencies are available without network access. @@ -1091,7 +1085,7 @@ GC algorithm: (1) scan all `.spore-lock` files in known project directories, (2) ### Fine-grained effects (design direction) -For v0.1, Spore uses **coarse-grained** effect names in the language-level type system: +The accepted surface uses **coarse-grained** effect names in the language-level type system: ```spore fn read_config() -> Config ! IoError @@ -1123,7 +1117,7 @@ network: filesystem: env: process: monotonic fast ``` -For v0.1, the language-level `uses [FileRead]` maps to coarse categories. Fine-grained path restrictions are enforced in `spore.toml` and at runtime via platform policy / host configuration, not in the type system. Promoting fine-grained effects into the type system is reserved for a future SEP. 
+For the accepted surface, the language-level `uses [FileRead]` maps to coarse categories. Fine-grained path restrictions are enforced in `spore.toml` and at runtime via platform policy / host configuration, not in the type system. Promoting fine-grained effects into the type system is reserved for a future SEP. ### Publish and discovery flow @@ -1134,8 +1128,8 @@ Spore has **no central registry**. Publishing a package means pushing to a Git r spore test && spore audit # 2. Commit and tag -git add . && git commit -m "Release v1.0.0" -git tag v1.0.0 +git add . && git commit -m "Release stable" +git tag stable-release # 3. Push git push origin main --tags @@ -1202,12 +1196,12 @@ Module: billing.invoice Snapshots use `sig` hashes only at the module and package level (impl changes do not propagate): -| Level | What is hashed | When it changes | -|---|---|---| -| Function `sig` | Canonical signature | Any signature component changes | -| Function `impl` | Compiled AST | Body changes, hole filling | -| Module interface | Sorted hash of `pub` + `pub(pkg)` `sig` hashes | Any exported signature changes | -| Package API | Sorted hash of module interface hashes | Any module interface changes | +| Level | What is hashed | When it changes | +| ---------------- | ---------------------------------------------- | ------------------------------- | +| Function `sig` | Canonical signature | Any signature component changes | +| Function `impl` | Compiled AST | Body changes, hole filling | +| Module interface | Sorted hash of `pub` + `pub(pkg)` `sig` hashes | Any exported signature changes | +| Package API | Sorted hash of module interface hashes | Any module interface changes | #### Interaction with the Error System @@ -1321,24 +1315,23 @@ Build: success (2 partial functions) #### Comparison with other approaches -| Approach | IO Model | Testability | Platform Independence | Example | -|---|---|---|---|---| -| **Traditional** | Built-in IO | Needs mocking | Low | C, Go, Java | -| 
**Monadic IO** | IO Monad | Medium | Low | Haskell | -| **Effect System** | Algebraic Effects | High | Medium | Koka, Eff | -| **Spore Platform** | Effect Handlers + Platform | **Very high** | **Very high** | Roc, Spore | +| Approach | IO Model | Testability | Platform Independence | Example | +| ------------------ | -------------------------- | ------------- | --------------------- | ----------- | +| **Traditional** | Built-in IO | Needs mocking | Low | C, Go, Java | +| **Monadic IO** | IO Monad | Medium | Low | Haskell | +| **Effect System** | Algebraic Effects | High | Medium | Koka, Eff | +| **Spore Platform** | Effect Handlers + Platform | **Very high** | **Very high** | Roc, Spore | Spore and Roc share the same design philosophy: no built-in Platform; all Platforms are third-party packages. The examples below use illustrative future `platform { ... }` sugar. The -current MVP surface remains package-backed manifests plus contract modules. +package surface remains package-backed manifests plus contract modules. #### Web Platform example ```spore platform WebPlatform { - version: "1.0.0" handles [NetListen, Clock, Spawn, DbQuery] startup: fn(req: Request) -> Response ! NetListen | DbQuery @@ -1354,7 +1347,6 @@ platform WebPlatform { ```spore platform LambdaPlatform { - version: "1.0.0" handles [NetConnect, S3Read, S3Write, DynamoRead, DynamoWrite] startup: fn(event: JsonValue) -> JsonValue ! S3Read | DynamoWrite @@ -1486,7 +1478,6 @@ spore init my-platform --type platform ```spore platform EmbeddedPlatform { - version: "0.1.0" handles [GpioRead, GpioWrite, Timer, SerialRead, SerialWrite] startup: fn() -> Never ! 
GpioRead | GpioWrite | Timer @@ -1532,13 +1523,13 @@ Platforms are content-addressed packages like any other, following the same `sig #### Platform subsystem interactions -| Subsystem | Interaction | -|---|---| -| **Concurrency** | `Spawn` is an effect surfaced by Platform-owned modules; runtimes may back it with a thread pool, tokio, or another scheduler | -| **Package management** | Platforms are ordinary Spore packages, fetched and cached via the same content-addressed system | -| **Effect system** | Platform defines the manifest-declared effect contract. Current implementations verify the selected startup entry's required effects against `[platform].handled-effects`; broader whole-application enforcement remains follow-up work | -| **Cost model** | Platform effects carry cost annotations: `File.read @cost(io=1)`. The compiler uses these for cost analysis | -| **Compiler** | Resolves Platform package metadata and contract modules into the compiled runtime boundary | +| Subsystem | Interaction | +| ---------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| **Concurrency** | `Spawn` is an effect surfaced by Platform-owned modules; runtimes may back it with a thread pool, tokio, or another scheduler | +| **Package management** | Platforms are ordinary Spore packages, fetched and cached via the same content-addressed system | +| **Effect system** | Platform defines the manifest-declared effect contract. The selected startup entry's required effects are verified against `[platform].handled-effects`; broader whole-application enforcement remains follow-up work | +| **Cost model** | Platform effects carry cost annotations: `File.read @cost(io=1)`.
The compiler uses these for cost analysis | +| **Compiler** | Resolves Platform package metadata and contract modules into the compiled runtime boundary | #### Platform API reference @@ -1576,14 +1567,14 @@ pub foreign fn exit(code: U8) -> Never uses [Exit] **Comparison with Roc, Elm, and Koka**: -| Feature | Roc | Elm | Koka | Spore | -|---|---|---|---|---| -| Platform concept | ✓ | Partial (runtime) | ✗ | ✓ | -| Effect system | ✗ (tag unions) | ✗ (Cmd/Sub) | ✓ | ✓ | -| One Platform / project | ✗ | ✗ | N/A | ✓ | -| Test Platform | Mock Platform | Mock Cmd | Manual mock | Test Platform | -| Startup contract | Platform-defined | Fixed (main) | Fixed (main) | Platform-defined | -| Concurrency | Platform-provided | Built-in runtime | Effect handler | Effect handler | +| Feature | Roc | Elm | Koka | Spore | +| ---------------------- | ----------------- | ----------------- | -------------- | ---------------- | +| Platform concept | ✓ | Partial (runtime) | ✗ | ✓ | +| Effect system | ✗ (tag unions) | ✗ (Cmd/Sub) | ✓ | ✓ | +| One Platform / project | ✗ | ✗ | N/A | ✓ | +| Test Platform | Mock Platform | Mock Cmd | Manual mock | Test Platform | +| Startup contract | Platform-defined | Fixed (main) | Fixed (main) | Platform-defined | +| Concurrency | Platform-provided | Built-in runtime | Effect handler | Effect handler | ### CLI command reference @@ -1713,12 +1704,12 @@ Two libraries depend on the same `utils` package with the same `sig` but differe $ spore deps my-app ├── lib-a@sig:aaa+impl:bbb -│ └── utils@sig:v1+impl:v1 +│ └── utils@sig:same+impl:alpha └── lib-b@sig:ccc+impl:ddd - └── utils@sig:v1+impl:v2 + └── utils@sig:same+impl:beta ``` -Both versions coexist. If `my-app` directly depends on `utils`, it must choose explicitly: `spore add utils --impl v1`. +Both implementations coexist. If `my-app` directly depends on `utils`, it must choose explicitly: `spore add utils --impl alpha`. 
#### Scenario 4: Local monorepo development @@ -1740,30 +1731,30 @@ spore build ```bash spore test && spore audit spore hash # record sig + impl hashes -git add . && git commit -m "Release v1.0.0" -git tag v1.0.0 +git add . && git commit -m "Release stable" +git tag stable-release git push origin main --tags # Update README with spore.toml snippet including hashes ``` ### Glossary -| Term | Definition | -|---|---| -| **Content-addressed** | Identifying resources by the hash of their content rather than by name or location | -| **Signature hash (`sig`)** | BLAKE3 hash of a function's public contract (params, return type, errors, effects, cost) | -| **Implementation hash (`impl`)** | BLAKE3 hash of a function's compiled AST; `None` for partial functions | -| **Named alias** | A human-readable identifier mapping to a specific set of content hashes | -| **Dependency graph** | DAG of module/package import relationships | -| **Effect** | A declared permission for a function to perform a specific category of operation | -| **Effect ceiling** | A reserved follow-up concept for module- or project-level effect caps; not part of the shipped MVP contract today | -| **Lock file (`.spore-lock`)** | Machine-generated file recording the full resolved dependency graph with hashes | -| **Diamond dependency** | When the same module is reached via multiple paths in the dependency graph | -| **Reproducible build** | A build that produces identical output from identical inputs (ensured by `impl` hashes) | -| **Platform** | A Spore package that declares startup metadata/contracts and exposes the host-facing modules used by an application | -| **Effect handler** | A runtime host implementation detail that may sit behind Platform package surfaces; not the primary package contract model | -| **Hole** | A placeholder (`?name`) in a function body, marking incomplete code | -| **Partial function** | A function containing holes; has `impl = None` until all holes are filled | +| Term | 
Definition | +| -------------------------------- | -------------------------------------------------------------------------------------------------------------------------- | +| **Content-addressed** | Identifying resources by the hash of their content rather than by name or location | +| **Signature hash (`sig`)** | BLAKE3 hash of a function's public contract (params, return type, errors, effects, cost) | +| **Implementation hash (`impl`)** | BLAKE3 hash of a function's compiled AST; `None` for partial functions | +| **Named alias** | A human-readable identifier mapping to a specific set of content hashes | +| **Dependency graph** | DAG of module/package import relationships | +| **Effect** | A declared permission for a function to perform a specific category of operation | +| **Effect ceiling** | A reserved follow-up concept for module- or project-level effect caps; not part of this SEP's contract | +| **Lock file (`.spore-lock`)** | Machine-generated file recording the full resolved dependency graph with hashes | +| **Diamond dependency** | When the same module is reached via multiple paths in the dependency graph | +| **Reproducible build** | A build that produces identical output from identical inputs (ensured by `impl` hashes) | +| **Platform** | A Spore package that declares startup metadata/contracts and exposes the host-facing modules used by an application | +| **Effect handler** | A runtime host implementation detail that may sit behind Platform package surfaces; not the primary package contract model | +| **Hole** | A placeholder (`?name`) in a function body, marking incomplete code | +| **Partial function** | A function containing holes; has `impl = None` until all holes are filled | ## Human experience impact @@ -1820,7 +1811,7 @@ The `.spore-lock` file is a TOML document that encodes the full dependency graph ```toml [[package]] name = "http" -alias = "std-http-v2" +alias = "std-http-stable" source = { type = "git", url =
"https://github.com/std/http", rev = "abc1234" } sig = "a3f9c2e1d8b7a6f5" impl = "7b8d4f2a1e9c3d5b" @@ -1845,7 +1836,7 @@ For interface-only dependencies, the compiler can produce `.spore-sig` files con ### Platform binding records The compiler resolves the selected Platform package's metadata, startup contract, -and imported host-facing modules into the compiled output. Current MVP behavior +and imported host-facing modules into the compiled output. This SEP centers startup-contract validation plus the selected Platform package boundary; broader whole-program effect routing remains follow-up work. @@ -1858,17 +1849,17 @@ of introducing a separate `E3xxx` / `E4xxx` namespace. Older draft placeholders in those ranges are retired in favor of the shared `M0xxx` and `C0xxx` registries. -| Code | Category | Example | -|---|---|---| -| `M0201` | Visibility violation | Accessing a private function from another module | -| `M0101` | Circular dependency | Module A → B → A | -| `C0101` | Effect declaration mismatch | Function calls an operation requiring `FileWrite` without declaring `uses [FileWrite]` | -| `M0401` | Signature change detected | `sig` hash mismatch in `.spore-lock` | -| `M0204` | Alias chain | `pub alias` pointing to another alias | -| `M0304` | Shadowing conflict | Import alias conflicts with module name | -| `M0501` | Platform binding conflict | Project declares more than one Platform binding | -| `C0201` | Unsupported startup effect | Selected startup entry requires an effect not listed in the Platform's `[platform].handled-effects` metadata | -| `M0502` | Startup contract mismatch | Startup function signature doesn't match Platform requirement | +| Code | Category | Example | +| ------- | --------------------------- | ------------------------------------------------------------------------------------------------------------ | +| `M0201` | Visibility violation | Accessing a private function from another module | +| `M0101` | Circular dependency | Module A → B
→ A | +| `C0101` | Effect declaration mismatch | Function calls an operation requiring `FileWrite` without declaring `uses [FileWrite]` | +| `M0401` | Signature change detected | `sig` hash mismatch in `.spore-lock` | +| `M0204` | Alias chain | `pub alias` pointing to another alias | +| `M0304` | Shadowing conflict | Import alias conflicts with module name | +| `M0501` | Platform binding conflict | Project declares more than one Platform binding | +| `C0201` | Unsupported startup effect | Selected startup entry requires an effect not listed in the Platform's `[platform].handled-effects` metadata | +| `M0502` | Startup contract mismatch | Startup function signature doesn't match Platform requirement | ### Diagnostic structure @@ -1972,31 +1963,31 @@ Go's `go.sum` file shows that content hashing for dependency verification is mai ### From semantic versioning -Existing ecosystems using semver can migrate by computing content hashes for each versioned release: +Existing ecosystems using semver can migrate by computing content hashes for each labeled release: ```bash -# Migration tool computes hashes from existing version tags +# Migration tool computes hashes from existing release tags spore migrate --from package.json --to spore.toml ``` The tool: 1. Reads the existing manifest. -2. Resolves Git tags corresponding to version numbers. +2. Resolves Git tags corresponding to release labels. 3. Computes BLAKE3 `sig` and `impl` hashes for each dependency. 4. Generates `spore.toml` with hash-based declarations. -The human-readable `version` field is preserved for documentation but plays no role in resolution. +The human-readable release label is preserved for documentation but plays no role in resolution. -### Lock file versioning +### Lock file schema evolution -The `.spore-lock` file carries a `version` field. Old tools refuse to process newer lock file versions. Migration scripts handle format upgrades: +The `.spore-lock` file carries a schema marker. 
Old tools refuse to process newer schema shapes. Migration scripts handle format upgrades: ```bash -spore lock upgrade --from 1 --to 2 +spore lock upgrade ``` -New fields added to `.spore-lock` are ignored by older tools (forward-compatible). Removal or semantic changes to existing fields trigger a version increment (backward-incompatible). +New fields added to `.spore-lock` are ignored by older tools when possible. Removal or semantic changes to existing fields require an explicit schema migration. ### Incremental adoption @@ -2004,29 +1995,38 @@ The module system does not require wholesale adoption: - Packages can adopt explicit `uses [...]` signatures incrementally at function boundaries; broader file- or project-level ceilings remain future work. - Dependencies can mix Git, local path, and registry sources. -- The `version` field in `spore.toml` remains available for human communication during the transition from semver. +- A human-readable release label in `spore.toml` remains available during the transition from semver. ## Unresolved questions 1. **Hash truncation for display**: How many hex characters should be shown in human-facing output? 8? 16? Full 64? Should there be a configurable default? -2. **~~Cross-package `pub(pkg)` boundaries in workspaces~~**: Resolved — `pub(pkg)` restricts visibility to the current package (member) only. Items must be promoted to `pub` for cross-member access. +2. **Effect granularity**: The current design uses coarse-grained effect names (e.g., `FileRead`, `NetConnect`). Should fine-grained effects (e.g., `filesystem:read:/data`) be part of the language-level type system or remain a tooling concern in `spore.toml`? -3. **Effect granularity**: The current design uses coarse-grained effect names (e.g., `FileRead`, `NetConnect`). Should fine-grained effects (e.g., `filesystem:read:/data`) be part of the language-level type system or remain a tooling concern in `spore.toml`? +3. 
**Platform evolution**: Platforms themselves evolve. When a Platform changes its handler interface, how are downstream applications notified? Is `spore --permit` sufficient, or do Platforms need a separate compatibility mechanism? -4. **Platform versioning**: Platforms themselves evolve. When a Platform changes its handler interface, how are downstream applications notified? Is `spore --permit` sufficient, or do Platforms need a separate compatibility mechanism? +4. **Module-level cost budgets**: The current design tracks cost per-function only. Should modules declare aggregate cost ceilings? What are the semantics when a module's total cost exceeds a declared budget? -5. **Module-level cost budgets**: The current design tracks cost per-function only. Should modules declare aggregate cost ceilings? What are the semantics when a module's total cost exceeds a declared budget? +5. **Diamond dependencies with different `impl` hashes**: When two transitive dependencies require the same module with the same `sig` but different `impl` hashes, the current design allows both to coexist. Should the linker deduplicate? Should the developer be warned? What are the binary size implications? -6. **Alias chain depth**: Alias chains (alias → alias) are currently forbidden. Should single-level chains be permitted for ergonomic re-exports, or does the restriction stand? +6. **Effect handler composition**: When an application-defined effect maps to Platform effects (e.g., `Logger` → `StdOut`/`StdErr`), how are effect requirements tracked through the composition? Is the current `uses [...]` propagation sufficient? -7. **Diamond dependencies with different `impl` hashes**: When two transitive dependencies require the same module with the same `sig` but different `impl` hashes, the current design allows both to coexist. Should the linker deduplicate? Should the developer be warned? What are the binary size implications? +7. 
**Conditional compilation**: The `[features]` mechanism in `spore.toml` enables optional dependencies, but the interaction between features and content hashes is underspecified. Does enabling a feature change the `sig` hash? -8. **Effect handler composition**: When an application-defined effect maps to Platform effects (e.g., `Logger` → `StdOut`/`StdErr`), how are effect requirements tracked through the composition? Is the current `uses [...]` propagation sufficient? +8. **Offline-first workflows**: How should `spore add` and `spore update` behave when offline? Should the local cache be sufficient for all operations, or are some operations inherently online? -9. **Conditional compilation**: The `[features]` mechanism in `spore.toml` enables optional dependencies, but the interaction between features and content hashes is underspecified. Does enabling a feature change the `sig` hash? +### Resolved questions -10. **Offline-first workflows**: How should `spore add` and `spore update` behave when offline? Should the local cache be sufficient for all operations, or are some operations inherently online? +1. **File-to-module mapping**: Resolved by SEP-0001 and this SEP. Each `.sp` + file is one module, the module path is derived from the filesystem, and + there is no `module` keyword in the accepted surface. + +2. **Cross-package `pub(pkg)` boundaries in workspaces**: Resolved — + `pub(pkg)` restricts visibility to the current package/member only. Items + must be promoted to `pub` for cross-member access. + +3. **Alias chain depth**: Resolved for the accepted surface. Alias chains remain forbidden; a + `pub alias` must point directly to the original item. --- @@ -2038,7 +2038,7 @@ effect ::= UPPER_IDENT // Imports and aliases import_decl ::= 'import' module_path ('as' IDENT)? - // NOTE (D12): selective imports not supported in v0.1 + // NOTE: selective imports are not part of the accepted surface alias_decl ::= visibility? 
'alias' IDENT '=' qualified_item module_path ::= IDENT ('.' IDENT)* qualified_item ::= module_path '.' IDENT @@ -2053,7 +2053,7 @@ fn_decl ::= visibility? 'fn' IDENT generic_params? '(' params ')' '->' ty generic_params ::= '[' IDENT (',' IDENT)* ']' params ::= (param (',' param)*)? param ::= IDENT ':' type -error_clause ::= '!' '[' type (',' type)* ']' +error_clause ::= '!' type ('|' type)* cost_clause ::= 'cost' '[' expr ',' expr ',' expr ',' expr ']' uses_clause ::= 'uses' cap_list @@ -2065,8 +2065,7 @@ field ::= IDENT ':' type // Platform declarations platform_decl ::= 'platform' IDENT '{' platform_body '}' -platform_body ::= ('version:' STRING)? - ('handles' cap_list) +platform_body ::= ('handles' cap_list) ('startup:' type) handler_decl* @@ -2079,12 +2078,12 @@ effect_decl ::= 'effect' IDENT '{' effect_fn* '}' effect_fn ::= 'fn' IDENT '(' params ')' '->' type // spore.toml (TOML subset) -// [package] name, version, type, description, license, authors, spore-version, default-entry +// [package] name, release, type, description, license, authors, default-entry // [dependencies] = { git|path|alias, sig, impl?, optional?, branch?, tag?, rev? } // [dev-dependencies] // [features] = [deps/features] // [entries.] path = STRING -// [platform] git|path, version?, handles? +// [platform] git|path, handles? 
 // [overrides] = { sig, impl }
 // [build] script
 // [metadata] arbitrary key-value
@@ -2094,45 +2093,45 @@ effect_fn ::= 'fn' IDENT '(' params ')' '->' type
 ### Module commands
-| Command | Description |
-|---|---|
-| `spore check [path...]` | Check one or more source / entry files |
-| `spore build [path]` | Build the current project or one explicit file |
-| `spore test [path...]` | Validate test / spec files |
-| `spore fmt [path...]` | Format source code (including import ordering) |
-| `sporec compile [path...]` | Compile explicit input files |
-| `sporec holes [path]` | List all holes in a source file |
-| `sporec query-hole [path] [id]` | Inspect one named hole in a source file |
-| `sporec explain [code]` | Explain one diagnostic code |
+| Command | Description |
+| ------------------------------- | ---------------------------------------------- |
+| `spore check [path...]` | Check one or more source / entry files |
+| `spore build [path]` | Build the current project or one explicit file |
+| `spore test [path...]` | Validate test / spec files |
+| `spore fmt [path...]` | Format source code (including import ordering) |
+| `sporec compile [path...]` | Compile explicit input files |
+| `sporec holes [path]` | List all holes in a source file |
+| `sporec query-hole [path] [id]` | Inspect one named hole in a source file |
+| `sporec explain [code]` | Explain one diagnostic code |
 ### Snapshot and hash commands
-| Command | Description |
-|---|---|
-| `spore snapshot` | Show function, module, and package hashes |
-| `spore hash` | Compute current module/function hashes |
-| `spore --permit ` | Accept a signature change, update `.spore-lock` |
-| `spore --permit --all` | Accept all pending signature changes |
-| `spore exports ` | List a module's public API with hashes |
-| `spore cost-report ` | Show cost summary for a module |
+| Command | Description |
+| ------------------------- | ----------------------------------------------- |
+| `spore snapshot` | Show function, module, and package hashes |
+| `spore hash` | Compute current module/function hashes |
+| `spore --permit ` | Accept a signature change, update `.spore-lock` |
+| `spore --permit --all` | Accept all pending signature changes |
+| `spore exports ` | List a module's public API with hashes |
+| `spore cost-report ` | Show cost summary for a module |
 ### Package management commands
-| Command | Description |
-|---|---|
-| `spore init [name]` | Initialize new project |
-| `spore add ` | Add a dependency |
-| `spore remove ` | Remove a dependency |
-| `spore update [name]` | Update dependency hashes |
-| `spore lock` | Generate or update `.spore-lock` |
-| `spore lock --verify` | Verify `.spore-lock` integrity |
-| `spore fetch` | Download dependencies to local cache |
+| Command | Description |
+| --------------------- | ------------------------------------ |
+| `spore init [name]` | Initialize new project |
+| `spore add ` | Add a dependency |
+| `spore remove ` | Remove a dependency |
+| `spore update [name]` | Update dependency hashes |
+| `spore lock` | Generate or update `.spore-lock` |
+| `spore lock --verify` | Verify `.spore-lock` integrity |
+| `spore fetch` | Download dependencies to local cache |
 ### Audit and inspection commands
-| Command | Description |
-|---|---|
-| `spore audit` | Audit package effects, hashes, cycles |
-| `spore deps` | Show dependency tree |
-| `spore deps --reverse ` | Show reverse dependencies |
-| `spore gc` | Clean unused cache entries |
+| Command | Description |
+| -------------------------- | ------------------------------------- |
+| `spore audit` | Audit package effects, hashes, cycles |
+| `spore deps` | Show dependency tree |
+| `spore deps --reverse ` | Show reverse dependencies |
+| `spore gc` | Clean unused cache entries |
diff --git a/seps/SEP-0009-standard-library.md b/seps/SEP-0009-standard-library.md
index 5030cb9..4b30673 100644
--- a/seps/SEP-0009-standard-library.md
+++ b/seps/SEP-0009-standard-library.md
@@ -33,11 +33,9 @@ implementation, and ambiguous cases use trait-qualified syntax.
 The stdlib is deliberately small. Spore follows the principle that the standard library should provide exactly the types and functions needed to write idiomatic Spore code, with everything else available as packages.
-> **Target-surface note**: This SEP still contains historical `std.*` examples
-> in places. For the approved compositional-semantics wave, the engineering-
-> facing helper naming is expected to use `spore.combine`, `spore.merge`,
-> `spore.order`, and `spore.laws`, and explicitly does **not** introduce a
-> user-facing `std.algebra` surface.
+The standard-library naming uses `std.*` for core modules and `spore.*` for
+algebraic helper surfaces (for example, `spore.combine`, `spore.merge`,
+`spore.order`, and `spore.laws`). No `std.algebra` surface is introduced.
 ## Motivation
@@ -94,9 +92,8 @@ import std.json // json_parse, json_to_string
 ### How do I/O functions work?
-The current implementation does **not** yet ship a stabilized `std.io`
-abstraction layer. Manifest-backed projects import the selected Platform
-package's modules directly, and those modules expose `foreign fn` declarations
+Manifest-backed projects import the selected Platform
+package's modules directly. Those modules expose `foreign fn` declarations
 with explicit `uses [...]` requirements.
 ```spore
@@ -109,18 +106,16 @@ fn main() -> () uses [Console, Exit] {
 }
 ```
-For `basic-cli`, the shipped surface currently lives in modules such as
+For `basic-cli`, the surface lives in modules such as
 `basic_cli.stdout`, `basic_cli.stdin`, `basic_cli.file`, `basic_cli.env`, and `basic_cli.cmd`. A future `std.io` layer may wrap or re-export a common surface above those
-Platform packages, but that is not today's contract.
+Platform packages.
-There is also a narrower runtime path for declared effects in interpreter-style
-execution: `perform Effect.operation(...)` falls back through registered host
-handlers by the **qualified** `Effect.operation` key. Today that support is
-explicit rather than generic (for example, the CLI runtime supports
-`Console.print`, `Console.println`, and `Console.read_line`). Platform package
-modules remain the main shipped application-facing contract.
+For declared effects in interpreter-style
+execution, `perform Effect.operation(...)` falls back through registered host
+handlers by the **qualified** `Effect.operation` key. Platform package
+modules remain the primary application-facing contract.
 ## Reference-level explanation
@@ -130,16 +125,16 @@ The prelude is implicitly imported into every module. It contains:
 #### Primitive types
-| Type | Description | Default | Size / notes |
-|------|-------------|---------|----------------|
-| `I8`…`I64`, `U8`…`U64` | Fixed-width integers | `0` for the chosen width | 1–8 bytes |
-| `F32`, `F64` | IEEE-754 floats | `0.0` | 4 or 8 bytes |
-| `Bool` | Boolean | `false` | 1 byte |
-| `Str` | Immutable UTF-8 string | `""` | Variable |
-| `Unit` | Zero-valued type | `()` | 0 bytes |
-| `Never` | Bottom type (uninhabited) | — | 0 bytes |
+| Type | Description | Default | Size / notes |
+| ---------------------- | ------------------------- | ------------------------ | ------------ |
+| `I8`…`I64`, `U8`…`U64` | Fixed-width integers | `0` for the chosen width | 1–8 bytes |
+| `F32`, `F64` | IEEE-754 floats | `0.0` | 4 or 8 bytes |
+| `Bool` | Boolean | `false` | 1 byte |
+| `Str` | Immutable UTF-8 string | `""` | Variable |
+| `Unit` | Zero-valued type | `()` | 0 bytes |
+| `Never` | Bottom type (uninhabited) | — | 0 bytes |
-**No `Char`.** Unicode scalars are represented as `Str` values (typically length 1). Character predicates and conversions live in `stdlib/char.sp` (`is_digit`, `char_to_int`, …). This matches `spore` PR #113.
+**No `Char`.** Unicode scalars are represented as `Str` values (typically length 1). Character predicates and conversions live in `stdlib/char.sp` (`is_digit`, `char_to_int`, …). This matches the language specification.
 **Literals.** In the reference type checker, unsuffixed integer literals default to **`I64`** and floating-point literals default to **`F64`**. Standard-library signatures use explicit fixed-width names; `Int` and `Float` are not standard aliases.
@@ -438,20 +433,18 @@ impl[T] Set[T] where T: Hash + Eq {
 }
 ```
-#### `std.io` (future standardization layer)
+#### `std.io` (platform I/O)
-`std.io` is not yet the normative surface of the current implementation. Today,
-manifest-backed projects import Platform package modules directly. For
-`basic-cli`, the shipped modules are:
+`std.io` is a future standardization layer. Manifest-backed projects import Platform package modules directly. For `basic-cli`, the platform modules are:
-| Module | Example operations | Required effects |
-|---|---|---|
-| `basic_cli.stdout` | `print`, `println`, `eprint`, `eprintln` | `Console` |
-| `basic_cli.stdin` | `read_line` | `Console` |
-| `basic_cli.file` | `file_read`, `file_write`, `file_exists`, `file_stat` | `FileRead`, `FileWrite` |
-| `basic_cli.dir` | `dir_list`, `dir_mkdir` | `FileRead`, `FileWrite` |
-| `basic_cli.env` | `env_get`, `env_set` | `Env` |
-| `basic_cli.cmd` | `process_run`, `process_run_status`, `exit` | `Spawn`, `Exit` |
+| Module | Example operations | Required effects |
+| ------------------ | ----------------------------------------------------- | ----------------------- |
+| `basic_cli.stdout` | `print`, `println`, `eprint`, `eprintln` | `Console` |
+| `basic_cli.stdin` | `read_line` | `Console` |
+| `basic_cli.file` | `file_read`, `file_write`, `file_exists`, `file_stat` | `FileRead`, `FileWrite` |
+| `basic_cli.dir` | `dir_list`, `dir_mkdir` | `FileRead`, `FileWrite` |
+| `basic_cli.env` | `env_get`, `env_set` | `Env` |
+| `basic_cli.cmd` | `process_run`, `process_run_status`, `exit` | `Spawn`, `Exit` |
 ```spore
 pub foreign fn process_run(cmd: Str, args: List[Str]) -> Str ! ExecError uses [Spawn]
@@ -459,7 +452,7 @@ pub foreign fn process_run_status(cmd: Str, args: List[Str]) -> Option[U8] ! Exe
 pub foreign fn exit(code: U8) -> Never uses [Exit]
 ```
-`basic_cli.cmd.exit(code)` is the currently shipped explicit process-termination
+`basic_cli.cmd.exit(code)` is the explicit process-termination
 surface. In project mode it works together with SEP-0008's startup contract (`main() -> ()`), and the runtime converts the resulting structured outcome into a host exit status.
@@ -590,25 +583,25 @@ All error types implement `Error`, `Display`, and `Debug`.
 These 13 traits have special compiler support (SEP-0002). The stdlib provides their definitions:
-| Trait | Methods | Derivable | Description |
-|-------|---------|-----------|-------------|
-| `Eq` | `eq(self, other) -> Bool` | ✅ | Equality comparison |
-| `Ord` | `compare(self, other) -> Ordering` | ✅ | Total ordering |
-| `Clone` | `clone(self) -> Self` | ✅ | Deep copy |
-| `Display` | `display(self) -> Str` | ❌ | Human-readable formatting |
-| `Debug` | `debug(self) -> Str` | ✅ | Debug formatting |
-| `Hash` | `hash(self) -> U64` | ✅ | Hash computation |
-| `Default` | `default() -> Self` | ✅ | Default value |
-| `Serialize` | `serialize(self) -> List[U8]` | ✅ | Byte serialization |
-| `Deserialize` | `deserialize(bytes: List[U8]) -> Result[Self, ParseError]` | ✅ | Byte deserialization |
-| `Add` | `add(self, other) -> Self` | ❌ | `+` operator |
-| `Sub` | `sub(self, other) -> Self` | ❌ | `-` operator |
-| `Mul` | `mul(self, other) -> Self` | ❌ | `*` operator |
-| `Div` | `div(self, other) -> Self` | ❌ | `/` operator |
+| Trait | Methods | Derivable | Description |
+| ------------- | ---------------------------------------------------------- | --------- | ------------------------- |
+| `Eq` | `eq(self, other) -> Bool` | ✅ | Equality comparison |
+| `Ord` | `compare(self, other) -> Ordering` | ✅ | Total ordering |
+| `Clone` | `clone(self) -> Self` | ✅ | Deep copy |
+| `Display` | `display(self) -> Str` | ❌ | Human-readable formatting |
+| `Debug` | `debug(self) -> Str` | ✅ | Debug formatting |
+| `Hash` | `hash(self) -> U64` | ✅ | Hash computation |
+| `Default` | `default() -> Self` | ✅ | Default value |
+| `Serialize` | `serialize(self) -> List[U8]` | ✅ | Byte serialization |
+| `Deserialize` | `deserialize(bytes: List[U8]) -> Result[Self, ParseError]` | ✅ | Byte deserialization |
+| `Add` | `add(self, other) -> Self` | ❌ | `+` operator |
+| `Sub` | `sub(self, other) -> Self` | ❌ | `-` operator |
+| `Mul` | `mul(self, other) -> Self` | ❌ | `*` operator |
+| `Div` | `div(self, other) -> Self` | ❌ | `/` operator |
 ### 4.6 Platform binding architecture
-The current runtime stack for platform-backed I/O looks like this:
+The platform-backed I/O stack looks like this:
 ```text
 ┌──────────────────────────────────────────────┐
@@ -639,7 +632,7 @@ Key properties:
 2. **Each foreign surface** declares explicit effect requirements.
 3. **The selected Platform package** owns startup contracts and handled-effect metadata.
 4. **Some operations** may propagate structured runtime outcomes before host conversion (for example `Exit` in project mode).
-5. **Future stdlib wrappers** may layer above this, but they are not required for the current implementation.
+5. **Future stdlib wrappers** may layer above this.
 ### 4.7 Cost expression inputs
@@ -647,13 +640,13 @@ The cost annotation language (SEP-0004) does not call ordinary stdlib functions
 inside `cost [...]`.
Verified costs mention only Index parameters and `cost(f)` summaries:
-| Source | Example | Notes |
-|--------|---------|-------|
-| Index parameter | `N: Index` | Compile-time non-negative size symbol |
-| Indexed count | `Count[N]` | Runtime count whose static Index is `N` |
-| Indexed container | `Array[T, N]`, `Vec[T, max: N]` | Exposes `N` to CostExpr |
-| Index operation | `max(N, M)`, `min(N, M)`, `span(Hi, Lo)` | Pure IndexExpr, not runtime function calls |
-| Function summary | `cost(f)` | Substituted when `f` is concrete at the call site |
+| Source | Example | Notes |
+| ----------------- | ---------------------------------------- | ------------------------------------------------- |
+| Index parameter | `N: Index` | Compile-time non-negative size symbol |
+| Indexed count | `Count[N]` | Runtime count whose static Index is `N` |
+| Indexed container | `Array[T, N]`, `Vec[T, max: N]` | Exposes `N` to CostExpr |
+| Index operation | `max(N, M)`, `min(N, M)`, `span(Hi, Lo)` | Pure IndexExpr, not runtime function calls |
+| Function summary | `cost(f)` | Substituted when `f` is concrete at the call site |
 Dynamic methods such as `list.len()`, `s.len()`, `map.len()`, or `set.len()` return runtime integers. They are useful program APIs, but they are not
@@ -692,14 +685,17 @@ The stdlib is represented in the module system as:
       "trait": "Mappable",
       "name": "map",
       "type_params": ["A", "B", "N: Index"],
-      "params": [{"name": "self", "type": "Vec[A, max: N]"}, {"name": "f", "type": "(A) -> B"}],
+      "params": [
+        { "name": "self", "type": "Vec[A, max: N]" },
+        { "name": "f", "type": "(A) -> B" }
+      ],
       "return_type": "Vec[B, max: N]",
       "cost": "N * cost(f) + N",
       "effects": []
     }
   ],
   "types": [
-    {"name": "Option", "params": ["T"], "variants": ["Some(T)", "None"]}
+    { "name": "Option", "params": ["T"], "variants": ["Some(T)", "None"] }
   ]
 }
 ```
@@ -715,12 +711,12 @@ This structured representation enables:
 New diagnostic codes for stdlib-related errors:
-| Code | Category | Message |
-|------|----------|---------|
-| `E0018` | Type error | stdlib function applied to wrong type |
+| Code | Category | Message |
+| ------- | ---------- | -------------------------------------- |
+| `E0018` | Type error | stdlib function applied to wrong type |
 | `E0019` | Type error | missing trait bound for stdlib generic |
-| `W0001` | Warning | unused stdlib import |
-| `W0002` | Warning | deprecated stdlib function |
+| `W0001` | Warning | unused stdlib import |
+| `W0002` | Warning | deprecated stdlib function |
 ## Drawbacks
@@ -741,19 +737,19 @@ Rejected. A large stdlib creates maintenance burden and version coupling. Spore'
 ### No foreign fn (pure Spore stdlib)
-Rejected. I/O operations cannot be expressed in pure Spore. The designed MVP
+Rejected. I/O operations cannot be expressed in pure Spore. The designed
 mechanism is Platform-owned package modules plus the selected Platform contract boundary (SEP-0008).
 ## Prior art
-| Language | Stdlib approach | Comparison |
-|----------|----------------|------------|
-| **Rust** | `std` + `core` (no-std) | Similar layered approach; Spore's is smaller |
-| **Roc** | Platform-provided I/O | Direct inspiration for Spore's platform model |
-| **Haskell** | `Prelude` + `base` | Similar prelude concept; Spore avoids Haskell's large `base` |
-| **Go** | Batteries-included | Larger than Spore's approach |
-| **Elm** | Small core + packages | Similar philosophy; Spore adds cost annotations |
+| Language | Stdlib approach | Comparison |
+| ----------- | ----------------------- | ------------------------------------------------------------ |
+| **Rust** | `std` + `core` (no-std) | Similar layered approach; Spore's is smaller |
+| **Roc** | Platform-provided I/O | Direct inspiration for Spore's platform model |
+| **Haskell** | `Prelude` + `base` | Similar prelude concept; Spore avoids Haskell's large `base` |
+| **Go** | Batteries-included | Larger than Spore's approach |
+| **Elm** | Small core + packages | Similar philosophy; Spore adds cost annotations |
 ## Backward compatibility and migration
@@ -767,14 +763,19 @@ This is a new specification — no backward compatibility concerns. However:
 1. **Mutable state API**: Should `Ref[T]` be in the prelude or `std.state`? Currently referenced in SEP-0001 but not fully specified here.
-2. **Concurrency primitives**: `Task[T]`, `Chan[T]`, `select` are defined in SEP-0007 — should they be re-exported via `std.concurrent` or remain as language primitives?
+2. **Concurrency primitives**: `Task[T]`, `Channel[T]`, and `select` are defined in SEP-0007 — should they be re-exported via `std.concurrent` or remain as language primitives?
 3. **String encoding**: Should `Str` expose byte-level access, or only scalar-oriented indexing? Current spec assumes UTF-8; there is no `Char` type (length-1 `Str` values instead).
-4. **Numeric tower**: The reference implementation exposes fixed widths (`I8`…`U64`, `F32`, `F64`). Stdlib APIs should choose explicit widths at each boundary rather than relying on abstract scalar aliases.
+4. **Iterator protocol**: Should there be a lazy `Iterator[T]` trait instead of materializing `List[T]` for all operations? This would change cost signatures significantly.
-5. **Iterator protocol**: Should there be a lazy `Iterator[T]` trait instead of materializing `List[T]` for all operations? This would change cost signatures significantly.
+5. **Error recovery**: How should `PanicError` interact with the effect system? Should `panic` require an effect?
-6. **Error recovery**: How should `PanicError` interact with the effect system? Should `panic` require an effect?
+6. **FFI type mapping**: How do Spore types map to Rust/C types across the FFI boundary? (e.g., `I64` ↔ `i64`, `Str` ↔ UTF-8 buffer)
-7. **FFI type mapping**: How do Spore types map to Rust/C types across the FFI boundary? (e.g., `I64` ↔ `i64`, `Str` ↔ UTF-8 buffer)
+### Resolved questions
+
+1. **Numeric tower**: Resolved by SEP-0002 and SEP-0001. The language surface
+   uses fixed-width numeric types (`I8` through `U64`, plus `F32` and `F64`),
+   and stdlib APIs should choose explicit widths at each boundary rather than
+   relying on abstract scalar aliases.