Maintenance script followups: compaction tuning, dedup, orphan recovery, fsck #49
Conversation
Modern libpq with standard_conforming_strings=on (default since 9.1) rejects the old \' escape form. Switch to '' doubling and drop the no-longer-needed backslash handling. Same fix in millpond/ducklake.py and tools/maintenance.py; unit tests updated to match.
The prior-value read in _scoped_target_file_size was unreliable: DuckLake
stores target_file_size as a byte-count string and ducklake_options('lake')
can return multiple rows across GLOBAL/SCHEMA/TABLE scopes, so the restore
SET intermittently failed with ParserException. Drop the read and always
restore to the documented steady-state default.
DuckDB's default spill path '.tmp' relative to CWD is on the read-only rootfs in the millpond image, so any compaction that needed to spill crashed. Point spills at /tmp, the writable emptyDir in the cron pod.
enable_http_logging and pg_debug_show_queries both add per-call overhead that compounds across 30k+ S3 deletes and many catalog reads. Default them off and add a top-level --debug flag to opt back in for short-lived debugging.
DuckLake's Postgres metadata schema is named __ducklake_metadata_<attach>, so the attach name and the schema name must be derived from one source or queries silently target the wrong schema. Introduce ATTACH_NAME and METADATA_SCHEMA constants and route every DuckLake call through them. No behavior change.
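A minimal sketch of that convention (constant names are from this PR; the derivation mirrors the documented __ducklake_metadata_<attach> rule):

```python
# Single source of truth: the metadata schema name is derived from the
# ATTACH alias, so the two can never drift apart.
ATTACH_NAME = "lake"
METADATA_SCHEMA = f"__ducklake_metadata_{ATTACH_NAME}"
```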
The legacy SET s3_* form is honored for some duckdb/httpfs paths but
not for ad-hoc glob('s3://...') / read_parquet('s3://...') in 1.4, so
operators got 403s mid-incident until they ran CREATE SECRET by hand.
Switch the shared _setup block to CREATE OR REPLACE SECRET s3 so every
shell session starts with S3 access wired up.
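A hedged sketch of that _setup change. DuckDB's CREATE OR REPLACE SECRET syntax is real; the helper name and the env-var names here are illustrative, not the shipped ones:

```python
import os

def _sql_quote(v: str) -> str:
    # DuckDB-side SQL string literal: double embedded single quotes.
    return "'" + v.replace("'", "''") + "'"

def _setup_s3_secret(conn) -> None:
    # Env-var names are assumptions; the shipped code reads its own config.
    key = os.environ["AWS_ACCESS_KEY_ID"]
    secret = os.environ["AWS_SECRET_ACCESS_KEY"]
    region = os.environ.get("AWS_REGION", "us-east-1")
    conn.execute(
        "CREATE OR REPLACE SECRET s3 "
        f"(TYPE S3, KEY_ID {_sql_quote(key)}, SECRET {_sql_quote(secret)}, "
        f"REGION {_sql_quote(region)})"
    )
```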
Every maintenance recipe we keep adding (dedup, orphan-clean, ctid-based ops) needs the upstream Postgres ATTACHed via the duckdb postgres extension. Bake it in alongside the lake ATTACH so operators don't have to type the connection string by hand. Extract pg_connstr to share the libpq form between the ducklake-catalog and pg ATTACHes.
DuckDB 1.4 raises 'Required module pytz failed to import' on any Python fetch from a TIMESTAMPTZ column; stdlib zoneinfo is not accepted as a substitute. Verified empirically against duckdb 1.4.4. ~500 KB on the image; cheap insurance against the next time a maintenance recipe SELECTs a snapshot timestamp.
Future tooling that reads the catalog directly (not through ducklake_* SQL functions) needs to know the schema name is derived from the ATTACH alias. Point readers at the ATTACH_NAME / METADATA_SCHEMA constants in maintenance.py so the two never drift.
ducklake_merge_adjacent_files isn't fully streaming today (DuckLake bug c8), so the catalog-wide merge happily consumes 24+ GiB / 12 threads on a real lake and OOMKills the cron pod. Set conservative defaults via a new _set_compaction_tuning helper (threads=2, memory_limit=4GB, preserve_insertion_order=false, http_timeout=600s) and expose --threads / --memory-limit so operators can raise the ceiling on lakes that fit. compact-probe is unchanged — it caps work via max_compacted_files.
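A sketch of what _set_compaction_tuning plausibly does, based on the settings named above; the inline whitelist stands in for the PR's own value sanitizer:

```python
import re

def _set_compaction_tuning(conn, threads: int = 2, memory_limit: str = "4GB") -> None:
    # Reject anything that isn't a plain size literal before interpolating it
    # into SQL; the shipped code uses its own _sanitize_setting_value for this.
    if not re.fullmatch(r"\d+(\.\d+)?[KMGT]i?B", memory_limit):
        raise ValueError(f"suspicious memory_limit: {memory_limit!r}")
    conn.execute(f"SET threads = {int(threads)}")
    conn.execute(f"SET memory_limit = '{memory_limit}'")
    conn.execute("SET preserve_insertion_order = false")
    conn.execute("SET http_timeout = 600000")  # 600 s, expressed in ms
```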
Snapshot the ducklake_files_scheduled_for_deletion queue depth before
and after cleanup, time the call, emit one key=value line:
cleanup-all throughput: files_processed=N queue_depth_before=A
queue_depth_after=B elapsed_s=T rate_obj_s=R
Lets an operator confirm the steady-state ~5 obj/s and feed the number
back to upstream DuckLake bug c2 (per-file vs per-batch commits in
cleanup_old_files) without enabling debug logging.
The same path can land in ducklake_files_scheduled_for_deletion across multiple snapshots (DuckLake bug c5); the second visit by cleanup-all hits NoSuchKey on S3, which currently rolls back the whole transaction (DuckLake bug c1) and corrupts the queue. Until those land upstream, operators need to dedup the queue before every cleanup-all run. This commit adds:
* tools/maintenance.sql — runtime-loaded SQL file shared by maintenance.py and the just shell recipe. Header documents the constraints any new recipe must follow (no LEFT ANTI JOIN, no duckdb-side ctid, advisory lock convention) and defines the count_pending_dups() macro.
* connect() ATTACHes the upstream Postgres as 'pg' for postgres_execute / postgres_query, then sources maintenance.sql so macros are available in every session (cron and interactive).
* dedup-deletions subcommand and justfile recipes — DELETE … WHERE ctid NOT IN (SELECT MIN(ctid) … GROUP BY path) via postgres_execute (sketched below).
* Dockerfile copies the SQL file alongside maintenance.py.
* README updated to reflect the current maintenance surface (dedup, cleanup-all, compact tuning flags, throughput log line, debug flag, maintenance.sql macros).
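A hedged sketch of the dedup statement, reconstructed from the commit text; the schema name follows the __ducklake_metadata_<attach> convention:

```python
DEDUP_SQL = """
DELETE FROM __ducklake_metadata_lake.ducklake_files_scheduled_for_deletion
WHERE ctid NOT IN (
    SELECT MIN(ctid)
    FROM __ducklake_metadata_lake.ducklake_files_scheduled_for_deletion
    GROUP BY path
)
""".strip()

def dedup_deletions(conn) -> None:
    # postgres_execute runs the statement on the upstream Postgres, where ctid
    # exists (duckdb-side ctid is unavailable, per the file header). Embedded
    # quotes in the inner SQL would need doubling; this statement has none.
    conn.execute(f"CALL postgres_execute('pg', '{DEDUP_SQL}')")
```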
Catalog-side orphans — rows in ducklake_files_scheduled_for_deletion whose S3 key no longer exists — poison cleanup-all because the S3 DELETE returns NoSuchKey and the whole txn rolls back (DuckLake bug c1). Operators need to be able to enumerate them before deciding to heal.

The macro takes the data path as a parameter rather than baking it in: DuckDB 1.4 evaluates a literal glob() inside CREATE MACRO eagerly, so a no-arg macro with a templated path would fire an S3 LIST on every connect(), even for cron runs that don't care. The Python find-orphans subcommand passes DUCKLAKE_DATA_PATH for you; ad-hoc users from just shell pass the path themselves.

Adds a tools/maintenance.sql header note about the eager-eval gotcha so the next person doesn't repeat the experiment. A sketch of the macro's shape follows.
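An illustrative reconstruction of the macro as of the branch tip (it folds in the rtrim and absolute-path fixes that land later in this PR); the real body lives in tools/maintenance.sql, and the exact column handling is an assumption:

```python
conn.execute("""
CREATE OR REPLACE MACRO find_catalog_orphans(data_path) AS TABLE
WITH listed AS (
    -- stays lazy: data_path is a macro parameter, not a literal, so no
    -- S3 LIST fires at CREATE MACRO time
    SELECT file FROM glob(rtrim(data_path, '/') || '/**')
),
queued AS (
    SELECT data_file_id, path,
           CASE WHEN path LIKE 's3://%' OR path LIKE '/%' THEN path
                ELSE rtrim(data_path, '/') || '/' || path END AS abs_path
    FROM __ducklake_metadata_lake.ducklake_files_scheduled_for_deletion
)
SELECT data_file_id, path
FROM queued
WHERE abs_path NOT IN (SELECT file FROM listed)
""")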
Add _acquire_advisory_lock helper that calls pg_try_advisory_lock on the
pg ATTACH and raises if another maintenance session already holds it.
Wire it into dedup-deletions before the DELETE so concurrent invocations
of the same recipe can't race the same rows.
The lock is keyed off hashtext('millpond-ducklake-maintenance')::bigint
and released implicitly on session close, so single-subcommand runs need
no explicit release. The helper is the shared entry point for the rest
of the SQL-first recipes (heal-orphans, cleanup-all-safe, fsck) that
will land in subsequent commits.
Five-step gated procedure that deletes catalog rows from
ducklake_files_scheduled_for_deletion whose S3 key no longer exists.
The set is materialized into a TEMP TABLE so the safety gates and the
final DELETE all see the same snapshot:
B1: ducklake_data_file is non-empty AND zero of the orphan paths
appear there (no vacuous pass on an empty catalog; no false
positive that would delete a still-live file)
B3: zero positional delete vectors (ducklake_delete_file) reference
an orphan data_file_id (such a file is still live for vector
lookups)
H4: advisory lock acquired before the DELETE
B2: DELETE matches on path, which is UUIDv7-globally-unique per
quirk r3
H1: single postgres_execute call so the DELETE is atomic at the
upstream Postgres layer
Adds a small _sql_string_literal helper for safe quoting of the
nested SQL strings sent through postgres_execute, and threads it
through the dedup-deletions DELETE for consistency.
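The helper is presumably just standard SQL quote-doubling; a minimal sketch:

```python
def _sql_string_literal(value: str) -> str:
    # SQL string literal for the duckdb-side parser: wrap in single quotes
    # and double any embedded ones. Distinct from the libpq connstring grammar.
    return "'" + value.replace("'", "''") + "'"
```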
Loop dedup-deletions + heal-orphans + cleanup-all under the advisory lock until cleanup-all exits clean. Until DuckLake bug c1 lands, every crashed cleanup-all (NoSuchKey on S3 DELETE rolls back the whole txn) creates fresh catalog-side orphans, so the next cleanup-all attempt needs the queue healed before it can succeed. Cap at --max-iterations (default 10) and raise non-zero if exhausted; the lock is taken once for the whole orchestration so the steps share mutual exclusion.
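A sketch of the retry loop under the stated constraints; sub-step names mirror the subcommands, and the exact signatures are assumptions:

```python
import duckdb

def cleanup_all_safe(conn, max_iterations: int = 10) -> None:
    _acquire_advisory_lock(conn)  # taken once for the whole orchestration
    for _ in range(max_iterations):
        dedup_deletions(conn)
        heal_orphans(conn)
        try:
            cleanup_all(conn)
            return  # exited clean
        except duckdb.IOException:
            # NoSuchKey rolled back the txn (DuckLake bug c1) and may have
            # left fresh catalog-side orphans; heal and retry.
            continue
    raise RuntimeError(f"cleanup-all still failing after {max_iterations} iterations")
```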
Wraps cleanup-all-safe (dedup + heal-orphans + cleanup-all in a loop) followed by ducklake_delete_orphaned_files to mop up S3-side orphans from any prior interrupted writes. Tiered compaction is intentionally not bundled — run compact-to-tier-N separately. Dry-run reports duplicate-queue depth, catalog-side orphan count, and the S3-orphan dry-run output without mutating anything; pair with fsck-dry-run in the justfile for ad-hoc audits.
Found by canary against the docker-compose lake:
1. connect() configured S3 access via the legacy SET s3_* form only.
The ducklake catalog driver honors that form, but DuckDB 1.4's httpfs
does not — ad-hoc glob('s3://...') / read_parquet('s3://...') go
through the SECRET manager. find-orphans (which uses glob() inside
the find_catalog_orphans macro) failed with HTTP 403 on every real
call. Mirror the same credentials into a CREATE OR REPLACE SECRET
alongside the existing SETs so both paths authenticate.
2. _acquire_advisory_lock embedded the inner SQL into postgres_query
without doubling the single quotes around 'millpond-ducklake-
maintenance', producing a ParserException at the duckdb-side parser.
Use the existing _sql_string_literal helper to wrap the inner SQL,
matching the pattern dedup-deletions and heal-orphans already use.
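Putting the two fixes together, the corrected lock helper plausibly looks like this; only the lock key and the _sql_string_literal wrapping are taken from the commits above, the rest is a sketch:

```python
LOCK_SQL = (
    "SELECT pg_try_advisory_lock("
    "hashtext('millpond-ducklake-maintenance')::bigint) AS ok"
)

def _acquire_advisory_lock(conn) -> None:
    # _sql_string_literal doubles the embedded single quotes so the inner
    # Postgres SQL survives the duckdb-side parser.
    row = conn.execute(
        f"SELECT ok FROM postgres_query('pg', {_sql_string_literal(LOCK_SQL)})"
    ).fetchone()
    if not row or not row[0]:
        raise RuntimeError("another maintenance session holds the advisory lock")
```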
Pull request overview
Adds a broad set of maintenance-tool follow-ups around DuckLake operations: safer cleanup/orphan recovery workflows, lower-default resource usage for compaction, and supporting SQL/docs/image plumbing so operators can run these recipes from the shipped container.
Changes:
- Extends `tools/maintenance.py` with new maintenance subcommands, advisory locking, throughput logging, compaction tuning defaults, and SQL macro loading.
- Adds `tools/maintenance.sql`, wires it into the image and interactive tooling, and updates docs/agent guidance around the new maintenance flow.
- Adds `pytz` to runtime dependencies and updates libpq escaping/tests for current Postgres/libpq behavior.
Reviewed changes
Copilot reviewed 9 out of 11 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `uv.lock` | Locks the new pytz runtime dependency. |
| `tools/maintenance.sql` | Adds reusable DuckDB macros and maintenance SQL conventions. |
| `tools/maintenance.py` | Implements the new maintenance/orphan-recovery/compaction behavior. |
| `tools/justfile` | Expands operator recipes and interactive shell setup. |
| `tests/unit/test_ducklake.py` | Updates `_escape_libpq` expectations. |
| `README.md` | Documents the expanded maintenance workflow and commands. |
| `pyproject.toml` | Declares pytz as a runtime dependency. |
| `millpond/ducklake.py` | Updates libpq escaping implementation/docs in app code. |
| `Dockerfile` | Copies the new SQL macro file into the image. |
| `AGENT.md` | Records new maintenance-tooling conventions and metadata-schema guidance. |
| `.gitignore` | Ignores FOLLOWUPS.md. |
Five subcommands and the supporting plumbing landed without docs: find-orphans, heal-orphans (with B1/B3 safety gates), cleanup-all-safe, fsck, plus the advisory-lock convention and the SECRET-vs-legacy-S3 split in connect(). Adds a 'Catalog-side orphan recovery' subsection to the README's DuckLake Maintenance section and expands AGENT.md's Maintenance Tooling section to cover the recovery toolkit, the SQL macro file's eager-eval gotcha, and the throughput log line.
The .sql file is loaded two ways: maintenance.py runs it through
str.format() before conn.execute(), but the just shell recipe sources
it via the duckdb CLI's .read meta-command, which reads it verbatim.
The {schema} placeholders survived the second path, so opening the
shell hit a parser error instead of loading the advertised macros.
Drop the templating: write __ducklake_metadata_lake literally, update
the file's header to document the no-templating rule, drop the
.format() call in connect(). Both paths now execute the same bytes.
ATTACH_NAME / METADATA_SCHEMA still exist for Python-side queries.
AGENT.md updated to match.
Previously the SECRET was hardcoded to s3.<region>.amazonaws.com / URL_STYLE 'vhost' / USE_SSL true, which broke just shell against the repo's own MinIO compose stack and any non-AWS S3-compatible target, even though maintenance.py picked up these env vars correctly. Read all three from env with the same AWS-style defaults maintenance.py uses, so AWS deployments still work without any extra config and MinIO / R2 / etc. work out of the box once the env is set.
The docstring claimed step 1 was 'take the advisory lock', but the code took it at step 5 (right before the DELETE), after the scan and the B1/B3 gates had already run. For the standalone heal-orphans invocation that left a window where another maintenance writer could change ducklake_files_scheduled_for_deletion between scan and DELETE, so the gates had passed against state that no longer held by the time we mutated.

(The cleanup-all-safe orchestrator wasn't affected — it took the lock once for the whole loop — but the standalone subcommand is exactly the case the docstring's snapshot claim was made for.)

Move the acquire to the top of heal_orphans, before materializing. Skip on --dry-run since dry-run doesn't mutate and the dry-run output is informational only.
The dry-run path inlined raw count queries against count_pending_dups() and find_catalog_orphans() and then called orphans(dry_run=True). It never went through heal_orphans, so the B1 (vacuous-pass + 'orphan' still live) and B3 (positional delete vector references) gates were skipped — a real fsck that would abort during heal-orphans showed as healthy in the dry-run.

Delegate to the dry-run forms of the individual steps. heal_orphans with dry_run=True runs both gates and raises on failure, so the dry-run now produces the same go/no-go signal the real fsck would.
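The delegation shape, sketched; the helper names are assumptions, and the flow follows the commit text and the fsck description above:

```python
def fsck(conn, dry_run: bool = False) -> None:
    if dry_run:
        dedup_deletions(conn, dry_run=True)
        heal_orphans(conn, dry_run=True)   # runs B1/B3 gates, raises on failure
        delete_orphaned_files(conn, dry_run=True)  # wraps ducklake_delete_orphaned_files
        return
    cleanup_all_safe(conn)
    delete_orphaned_files(conn)
```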
29 tests covering tier A (pure helpers and argparse plumbing) and tier B (orchestrator logic with mocked sub-calls):
* _sql_string_literal quote-doubling
* _log_cleanup_throughput message shape, divide-by-zero guard
* argparse plumbing for the new subcommands (dedup-deletions, find-orphans, heal-orphans, cleanup-all-safe, fsck) plus --threads / --memory-limit on compact
* cleanup_all_safe retry loop: succeed-on-first, retry-after-IOException, exhaust-iterations
* fsck: dry-run delegates to dry-run sub-calls (so heal-orphans gates actually run), real run goes through cleanup_all_safe, gate failure propagates out of dry-run
* _set_compaction_tuning emits expected SETs and rejects injection
* _acquire_advisory_lock emits the doubled-quote postgres_query form and raises when another session holds the lock

Tier C (macros against a stubbed catalog) and tier D (full e2e against the docker-compose stack) remain follow-up work — they need either an in-process duckdb with stubbed schemas or a real lake. Adds pythonpath = ["tools"] to pytest config so 'import maintenance' resolves.
…e default
The previous restore logic checked 'restore == DEFAULT_TARGET_FILE_SIZE'
to decide whether to log the conversion-failure warning, but
_bytes_to_human('134217728') returns '128MiB', which is exactly
DEFAULT_TARGET_FILE_SIZE. Healthy installs that already use the
default emitted a warning on every compaction run, making it look
like the restore was broken when it wasn't.
Gate the warning on 'converted is None' (the genuine conversion
failure case) instead. Add 6 unit tests pinning the four cases:
default-value round-trip → no warning; non-default convertible value
→ no warning; non-power-of-1024 input → warning; prior unset → no
warning; plus assertions that the restore SET targets the converted
form and the default fallback respectively.
Also tidy the awkward 'if v else None' validation pattern in
connect()'s SECRET path to a plain 'if v: _sanitize_setting_value(v)'.
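For reference, a sketch of the conversion helper and the corrected gate. The 'largest 1024^i unit that divides cleanly' rule and the '134217728' -> '128MiB' example come from this PR; the edge-case behavior and function shapes below are assumptions:

```python
import logging

log = logging.getLogger(__name__)

DEFAULT_TARGET_FILE_SIZE = "128MiB"  # documented steady-state default

def _bytes_to_human(raw):
    """E.g. '134217728' -> '128MiB'; None signals genuine conversion failure."""
    try:
        n = int(raw)
    except (TypeError, ValueError):
        return None
    for exp, unit in ((4, "TiB"), (3, "GiB"), (2, "MiB"), (1, "KiB")):
        step = 1024 ** exp
        if n >= step and n % step == 0:
            return f"{n // step}{unit}"
    return None  # non-power-of-1024 input: the case that should warn

def _restore_target(prior_value):
    if prior_value is None:
        return DEFAULT_TARGET_FILE_SIZE  # prior unset: fall back, no warning
    converted = _bytes_to_human(prior_value)
    if converted is None:  # gate on failure, not on equality with the default
        log.warning("could not convert prior target_file_size %r", prior_value)
    return converted or DEFAULT_TARGET_FILE_SIZE
```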
Earlier in this branch a7 changed _escape_libpq to SQL-style ''
doubling on the premise that 'modern libpq with
standard_conforming_strings=on requires '' doubling'. That premise
conflated two different parsers:
* The Postgres SQL parser interprets string literals; '' is
the embedded-quote form there, governed by
standard_conforming_strings.
* The libpq connection-string parser is a separate grammar that
has always required backslash escapes (\' and \\) inside
quoted values. It is not affected by
standard_conforming_strings.
Applying SQL rules at the libpq layer broke every code path that
builds a connstring once a value contains an apostrophe — which
never showed up in canary because every test password was alphanumeric.
Restore the original backslash escaping in millpond/ducklake.py
and tools/maintenance.py with a docstring that names the grammar
explicitly so this doesn't get re-broken. Revert the unit tests
to assert the libpq form. In tools/justfile, rewrite the inner
(libpq) layer to backslash-escape \ and ', then keep the outer
SQL-wrap layer unchanged (every ' in the resulting connstring
still gets doubled because the whole thing is wrapped in
'ATTACH '...' AS lake').
Verified end-to-end against a postgres:17 container booted with
POSTGRES_PASSWORD=O'Brien:
* maintenance.py connect() authenticates and ATTACHes
* just shell (run inside the millpond image so duckdb-cli +
duckdb-python versions match) authenticates via the pg
ATTACH and returns 'PostgreSQL 17.9 ...' from
postgres_query('pg', 'SELECT version()').
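The restored escaping, sketched against the grammar described above:

```python
def _escape_libpq(value: str) -> str:
    # libpq conninfo grammar: inside a single-quoted value, backslash-escape
    # backslashes and single quotes. standard_conforming_strings is a SQL-parser
    # setting and does not apply here. Order matters: backslashes first.
    return value.replace("\\", "\\\\").replace("'", "\\'")

# Illustrative use when building a conninfo fragment:
#   f"password='{_escape_libpq(password)}'"  ->  password='O\'Brien'
```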
Two real issues from the latest review:
* tools/justfile baked SET pg_debug_show_queries = true into the shell setup, so every interactive session paid the per-call overhead this PR's a14 commit had explicitly defaulted to off in maintenance.py. Mirror the maintenance.py policy: default both enable_http_logging and pg_debug_show_queries to false in _setup; operators who want them temporarily can SET them after entering the shell.
* The find_catalog_orphans / count_pending_dups macros had no automated coverage, even though path-normalization in that area is exactly where heal-orphans's B1 gate had a real regression. Add tests/unit/test_maintenance_macros.py: 11 tests against an in-memory DuckDB with a stub __ducklake_metadata_lake schema and a real local-filesystem glob, covering count_pending_dups arithmetic and find_catalog_orphans for absolute / relative / mixed path forms (per quirk r1) plus the empty-queue and empty-listing edge cases.
Three issues from review:
1. heal-orphans's B1 gate counted every row in ducklake_data_file
without filtering on end_snapshot. Historical (expired) rows
matching an orphan path made would_be_live non-zero and aborted
heal-orphans on perfectly valid queue entries. The compact
subcommand uses 'end_snapshot IS NULL' as the live-row marker
throughout — apply the same filter to the B1 gate's data_abs CTE
and to its 'vacuous catalog' guard so an empty live set is
distinguished from an empty catalog. Same fix to B3 against
ducklake_delete_file: only LIVE positional-delete vectors should
block heal-orphans; expired ones are not load-bearing.
2. tools/maintenance.sql hard-codes __ducklake_metadata_lake because
   the 'just shell' load path reads the file verbatim and templating
   is not compatible with that load path (we tried; it produced parser
errors). Add a unit test that fails loudly if anyone changes
ATTACH_NAME without updating the .sql file's references — the
.sql header documented the coupling but had no enforcement, and
the bot correctly flagged that as a single-source-of-truth gap.
3. Add 6 unit tests against the stub schema covering both gates with
live, expired, and mixed end_snapshot values, plus the schema-
coupling assertion.
Also updated the PR description to reflect what _escape_libpq and
_scoped_target_file_size actually do at the current branch tip
(the earlier table claimed behavior that was reverted partway
through review).
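A hedged sketch of the B1 gate's query shape as it stands at the branch tip (folding in the rtrim and /-rooted-path handling from the follow-up commit below); 'orphan_candidates' stands in for the TEMP TABLE of materialized orphans, and the exact names are assumptions:

```python
B1_SQL = """
WITH data_abs AS (
    SELECT CASE WHEN path LIKE 's3://%' OR path LIKE '/%' THEN path
                ELSE rtrim(?, '/') || '/' || path END AS abs_path
    FROM __ducklake_metadata_lake.ducklake_data_file
    WHERE end_snapshot IS NULL  -- live rows only; expired history is ignored
)
SELECT
    (SELECT COUNT(*) FROM data_abs) AS live_rows,
    (SELECT COUNT(*) FROM data_abs
      WHERE abs_path IN (SELECT abs_path FROM orphan_candidates)) AS would_be_live
"""

def _heal_orphans_b1_counts(conn, data_path):
    # Gate passes only if live_rows > 0 (non-vacuous catalog) and
    # would_be_live == 0 (no queue entry points at a still-live file).
    return conn.execute(B1_SQL, [data_path]).fetchone()
```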
A DUCKLAKE_DATA_PATH ending in '/' (e.g. s3://bucket/lake/data/) is a common operator-supplied form. Both find_catalog_orphans (in maintenance.sql) and the B1 safety gate (in heal_orphans) concatenated 'data_path || / || key' without stripping a trailing slash, so a relative-form queue path normalized to '.../data//file.parquet' that never matched the absolute form '.../data/file.parquet'. Net result: find-orphans flagged live files as orphans, and heal-orphans's gate let through queue entries pointing at still-live files.

Wrap data_path in rtrim(data_path, '/') everywhere it concatenates a key. Also broaden the B1 gate's absolute-path detection from 'path LIKE s3://%' to 'LIKE s3://% OR LIKE /%' so a /-rooted path (stored locally during tests; defensible if it ever appears in production via misconfiguration) isn't doubly-prefixed by the data path.

Adds 4 tests pinning the trailing-slash variant: find_catalog_orphans with absolute-form queue, with relative-form queue, with a real orphan present, and the B1 gate against a still-live file via the relative-form queue path.
Three issues from review plus two adjacent ones I caught in audit:
* The cleanup throughput line derived files_processed from a queue-depth delta. Without holding the advisory lock (and the lock doesn't serialize against arbitrary writers anyway) the delta is wrong whenever any other writer enqueues during the call — it could even go negative. Switch to len(result), the actual rows ducklake_cleanup_old_files returned, which is accurate regardless of concurrent writers (see the sketch after this list). Drop queue_depth_before; keep queue_depth_after as a remaining-work signal.
* Skip the throughput log on cleanup --dry-run. dry-run returns preview rows, so len(result) would claim work was done when none was — the old delta version printed 0 by accident.
* Validate DUCKDB_S3_USE_SSL in tools/justfile against a strict whitelist. Previously it was forwarded raw into both SET s3_use_ssl (a SQL string literal, vulnerable to a quote) and CREATE OR REPLACE SECRET USE_SSL (unquoted, raw SQL injection). The just-side if/else fails fast on bad input, matching the equivalent guard maintenance.py applies via _sanitize_setting_value.
* Extract _heal_orphans_b1_counts as a module-level helper so production heal_orphans and the unit tests share one source of truth for the gate query.
* Add 5 unit tests covering the gate's LIKE 's3://%' branch with literal s3:// paths in queue / data_file (matched, mixed forms, trailing-slash data_path, expired rows). Pre-existing tests used local-filesystem paths only.
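A sketch of the revised logging helper referenced in the first bullet above; the format string matches the code quoted in review, the surrounding shape is an assumption:

```python
import logging
import time

log = logging.getLogger(__name__)

def _log_cleanup_throughput(operation, result, started_at, queue_depth_after):
    # files_processed is the operation's return-row count, not a queue-depth
    # delta, so concurrent writers cannot skew it. Callers skip this on dry-run.
    elapsed = time.monotonic() - started_at
    rate = len(result) / elapsed if elapsed > 0 else 0.0  # divide-by-zero guard
    log.info(
        "%s throughput: files_processed=%d elapsed_s=%.1f rate_obj_s=%.1f queue_depth_after=%d",
        operation, len(result), elapsed, rate, queue_depth_after,
    )
```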
The previous commit changed the throughput log shape (dropped queue_depth_before; switched files_processed from a queue-depth delta to len(result); skipped on --dry-run) but the README example and the AGENT.md rationale still described the old shape. Operators copying the README example would have searched for a field that doesn't exist. Update both to the current shape and call out that files_processed is the operation's return-row count (race-free) rather than a delta.
| "%s throughput: files_processed=%d elapsed_s=%.1f rate_obj_s=%.1f queue_depth_after=%d", | ||
| operation, |
| conn.execute(f"SET enable_http_logging = {str(debug).lower()}") | ||
| conn.execute(f"SET pg_debug_show_queries = {str(debug).lower()}") |
Summary
Eighteen small, mostly-mechanical improvements to `tools/maintenance.py` and its surrounding scaffolding (justfile, SQL macro file, image, README/AGENT). Three themes: (1) plug latent bugs we hit operating cleanup against real lakes, (2) add a SQL-first orphan-recovery library so the manual recovery dance becomes a recipe, (3) make `compact` not OOM the cron pod by default. Every commit is one item from the team followups doc.

Bug fixes & defaults
- `_escape_libpq` keeps the libpq connection-string escapes (`\'` / `\\`, distinct from the Postgres SQL parser's `''` doubling; `standard_conforming_strings` governs the SQL parser, not libpq). An earlier commit on this branch mistakenly switched to SQL-style doubling and was reverted. Docstrings now name the grammar explicitly to prevent a re-occurrence.
- `_scoped_target_file_size` reads the prior `target_file_size` (filtered to scope = `'GLOBAL'` to avoid the original implementation's empty-tuple bug) and restores it on exit. DuckLake stores the value as a raw byte count; convert via a new `_bytes_to_human` helper that picks the largest 1024^i unit that divides cleanly. Fall back to `DEFAULT_TARGET_FILE_SIZE` only when conversion fails or the prior value is unset (operator-configured non-default values are preserved).
- `temp_directory`: `SET temp_directory = '/tmp/duckdb_spill'` in `connect()` so spills land on the writable emptyDir, not the read-only rootfs.
- `enable_http_logging` / `pg_debug_show_queries` default off; a top-level `--debug` flag opts back in. Both add per-call overhead that compounds across tens of thousands of S3 deletes.
- `cleanup` throughput log: one `key=value` line per cleanup, `files_processed=N elapsed_s=T rate_obj_s=R queue_depth_after=B` (files_processed is the operation's return-row count, not a queue-depth delta; skipped on dry-run).

Compaction tuning
- `--threads` (default 2) and `--memory-limit` (default 4GB) on `compact`: `ducklake_merge_adjacent_files` isn't fully streaming in 1.4 and over-uses memory relative to input volume; conservative defaults keep the cron pod alive on real lakes.
- `preserve_insertion_order=false`, `http_timeout=600000`.

SQL-first orphan-recovery library

DuckLake bug c5 (duplicate paths in `ducklake_files_scheduled_for_deletion`) combined with c1 (S3 NoSuchKey rolls back the whole txn) means every interrupted `cleanup-all` can leave the catalog in a worse state than it started. New subcommands form the fix-it-yourself toolkit until c1/c5 land upstream:
- `dedup-deletions`: `DELETE … USING MIN(ctid)` through `postgres_execute`.
- `find-orphans`: prints `data_file_id<TAB>path` to stdout.
- `heal-orphans`: runs the B1 (`ducklake_data_file` non-empty AND zero orphan paths still live) and B3 (zero positional-delete vectors reference orphan ids) safety gates, then `DELETE … WHERE path IN (...)` through `postgres_execute`.
- `cleanup-all-safe`: loops dedup + heal + cleanup until clean, capped at `--max-iterations` (default 10).
- `fsck`: `cleanup-all-safe` + `ducklake_delete_orphaned_files` (catalog-healthy end-to-end).

`tools/maintenance.sql` (new) holds the runtime-loaded macros (`count_pending_dups()`, `find_catalog_orphans(data_path)`) plus a header documenting the constraints (no `LEFT ANTI JOIN`, no duckdb-side `ctid`, advisory-lock key). Loaded by both `maintenance.py` connect and the `just shell` recipe.

Advisory lock keyed off `hashtext('millpond-ducklake-maintenance')::bigint`, taken on the `pg` ATTACH; provides mutual exclusion between maintenance invocations (not catalog-write atomicity against arbitrary writers; documented).

Plumbing
- `ATTACH_NAME = "lake"` / `METADATA_SCHEMA` constants.
- `pg (TYPE postgres)` ATTACH added to `connect()` and to `just shell` so `postgres_execute` / `postgres_query` are available everywhere.
- `CREATE OR REPLACE SECRET s3` alongside legacy `SET s3_*` in both `connect()` and `just shell`; the legacy form isn't honored by ad-hoc `glob('s3://...')` / `read_parquet('s3://...')` in 1.4.
- `pytz` dependency: DuckDB 1.4 raises `Required module 'pytz' failed to import` on any `TIMESTAMPTZ` fetch; stdlib `zoneinfo` is not accepted. ~500 KB.

Real-stack canary (docker-compose lake)

Caught two bugs I'd otherwise have shipped:
- `find-orphans` 403 because `connect()` only set legacy S3 env vars; fixed via `CREATE OR REPLACE SECRET s3`.
- `cleanup-all-safe` ParserException on advisory-lock SQL; fixed via `_sql_string_literal`.

End-to-end verified against a populated lake:
- `connect()` (ATTACH lake + pg, load macros, set SECRET)
- `find-orphans` on a clean lake
- `dedup-deletions --dry-run` / `heal-orphans --dry-run` / `fsck --dry-run`
- `cleanup-all-safe` against a 3,977-row queue
- `count_pending_dups()` / `find_catalog_orphans()` macros

Not exercised in canary: `heal-orphans` against manufactured orphans (skipped after `mc` tooling friction), and the NoSuchKey-retry loop in `cleanup-all-safe` (would need to deliberately trigger DuckLake bug c1).

Test plan
- `just lint` + `just fmt-check` clean
- image copies `tools/maintenance.sql` to `/app/tools/maintenance.sql`
- `python /app/tools/maintenance.py fsck --dry-run` reports counts without erroring
- `cleanup-all-safe` against a non-empty queue drains it and emits the throughput log line
- `compact --tier 1` honors the `--threads 2 --memory-limit 4GB` defaults (visible in pod resource use)