fix(batch): reconcile pipeline.md inbox after batch runs #712
batch-runner.sh records every evaluated offer in batch/batch-state.tsv but never writes back to data/pipeline.md. Offers processed via batch mode stay in the pipeline "Pendientes" inbox indefinitely: the next scan and the next `/career-ops pipeline` run re-surface them, producing duplicate reports and tracker rows. For anyone running batch regularly the inbox never drains.
Add reconcile-pipeline.mjs: for each completed/skipped entry in batch-state.tsv whose URL is still in pipeline.md "Pendientes", move the line to "Procesadas" with its report link, score and PDF flag. It is idempotent (an already-moved entry is a no-op), so it is safe to run after every batch. It is also conservative: failed entries and entries with no report file on disk are left in place. User-supplied --state/--pipeline paths are constrained to the repository tree.
batch-runner.sh now calls it from merge_tracker(), between the tracker merge and the integrity check. Also exposed as `npm run reconcile` for standalone use, and covered by test-all.mjs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
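The move described in this commit message can be illustrated with a small pure function. This is only a sketch under stated assumptions, not the code in reconcile-pipeline.mjs: the function name `reconcile`, the entry shape (`url`, `status`, `report`, `score`, `pdf`), the `## ` header syntax and the format of the moved line are all hypothetical; only the section names and the completed/skipped/failed rules come from the PR.

```javascript
// Hypothetical sketch of the reconcile step (names and line format are assumed).
function reconcile(pipelineText, entries) {
  // Only completed/skipped entries that have a report are eligible to move.
  const eligible = new Map(
    entries
      .filter((e) => (e.status === "completed" || e.status === "skipped") && e.report)
      .map((e) => [e.url, e]),
  );

  const kept = [];
  const moved = [];
  let section = null;
  for (const line of pipelineText.split("\n")) {
    const header = line.match(/^#+\s+(.*)$/);
    if (header) section = header[1].trim();
    const url = (line.match(/https?:\/\/\S+/) || [])[0];
    // A line is moved only while it still sits under "Pendientes".
    const entry = section === "Pendientes" && url ? eligible.get(url) : undefined;
    if (entry) {
      moved.push(`- [x] ${url} | ${entry.report} | ${entry.score} | PDF ${entry.pdf ? "yes" : "no"}`);
    } else {
      kept.push(line);
    }
  }
  if (moved.length === 0) return { text: pipelineText, moved: 0 }; // already in sync

  // Insert the moved lines right after the "Procesadas" header.
  const at = kept.findIndex((l) => /^#+\s+Procesadas\s*$/.test(l));
  if (at === -1) return { text: pipelineText, moved: 0 }; // no target section: leave untouched
  kept.splice(at + 1, 0, ...moved);
  return { text: kept.join("\n"), moved: moved.length };
}
```

Because a moved line lives under "Procesadas" afterwards, a second call with the same inputs finds nothing to move, which is the idempotency property the commit relies on.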
No actionable comments were generated in the recent review.
📝 Walkthrough
This PR implements automatic pipeline reconciliation for batch mode. When batch-runner.sh completes evaluation, it now calls reconcile-pipeline.mjs to move completed/skipped offers from pipeline.md "Pendientes" to "Procesadas" with report links and scores, preventing duplicate evaluation on re-runs. The reconciliation is idempotent and safe to run standalone or after every batch.
Changes
Pipeline reconciliation for batch runs
Sequence Diagram(s)
sequenceDiagram
participant Script as reconcile-pipeline.mjs
participant State as batch-state.tsv
participant Reports as reports/
participant Pipeline as pipeline.md
Script->>State: parse completed/skipped entries
Script->>Reports: find report files, extract Score/PDF
Script->>Pipeline: parse Pendientes and Procesadas, collect URLs
Script->>Pipeline: move matching Pendientes to Procesadas with report link, score, PDF flag
Script->>Pipeline: write updated pipeline.md (create .pre-reconcile.bak)
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Warning: There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.
🔧 ESLint: ESLint skipped: no ESLint configuration detected in root package.json.
Actionable comments posted: 1
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: db80901c-ee41-47e2-8734-a7b671d05073
📒 Files selected for processing (5)
batch/README.md
batch/batch-runner.sh
package.json
reconcile-pipeline.mjs
test-all.mjs
resolveInsideRepo() validated --state/--pipeline lexically only, so a symlink inside the repo could still resolve to a target outside it. Resolve the repo root and the target (or its parent, when the target does not exist yet) with realpathSync before the boundary check. Addresses CodeRabbit review feedback on santifer#712. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
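The difference between a lexical check and a realpath-based check can be sketched as follows. This is a hypothetical re-creation, not the PR's code: only the name `resolveInsideRepo` and the use of `realpathSync` on the root and on the target (or its parent) come from the commit message; the signature, error text and the throwaway demo directory are assumptions.

```javascript
import { existsSync, mkdirSync, mkdtempSync, realpathSync, symlinkSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { basename, dirname, isAbsolute, join, relative, resolve, sep } from "node:path";

// Hypothetical sketch: resolve symlinks first, then apply the boundary check.
function resolveInsideRepo(repoRoot, userPath) {
  const root = realpathSync(repoRoot);
  const lexical = resolve(root, userPath);
  // realpathSync throws on a missing path, so fall back to the real parent
  // directory when the target does not exist yet.
  const real = existsSync(lexical)
    ? realpathSync(lexical)
    : join(realpathSync(dirname(lexical)), basename(lexical));
  const rel = relative(root, real);
  if (rel === ".." || rel.startsWith(".." + sep) || isAbsolute(rel)) {
    throw new Error(`path escapes the repository: ${userPath}`);
  }
  return real;
}

// Demo in a throwaway directory: repo/link is a symlink to ../outside.
const sandbox = mkdtempSync(join(tmpdir(), "reconcile-demo-"));
const repo = join(sandbox, "repo");
mkdirSync(join(repo, "data"), { recursive: true });
mkdirSync(join(sandbox, "outside"));
writeFileSync(join(repo, "data", "pipeline.md"), "");
writeFileSync(join(sandbox, "outside", "state.tsv"), "");
symlinkSync(join(sandbox, "outside"), join(repo, "link"), "dir");

const accepted = resolveInsideRepo(repo, "data/pipeline.md");
let rejected = false;
try {
  // Lexically this is repo/link/state.tsv, but it really lives outside the repo.
  resolveInsideRepo(repo, "link/state.tsv");
} catch {
  rejected = true;
}
```

A purely lexical check would accept `link/state.tsv` because the string stays under the repo root; comparing real paths is what catches it.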
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@reconcile-pipeline.mjs`:
- Line 24: The path guard currently uses existsSync() but allows directories,
causing readFileSync()/copyFileSync() to later throw EISDIR; update
resolveInsideRepo() to also reject directory targets by using statSync/path or
fs.lstatSync to check isDirectory() and throw a clear error (or exit) when the
target is a directory. Ensure callers like the paths passed into readFileSync,
copyFileSync, and realpathSync (used when handling --state batch / --pipeline
data) receive only regular file paths and update the error message to indicate
the argument was a directory.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 36dd5770-6848-460c-9b10-1a3835007a05
📒 Files selected for processing (1)
reconcile-pipeline.mjs
A directory passed as --state/--pipeline (e.g. `--state batch`) cleared existsSync() and the boundary check, then crashed later with an unhandled EISDIR from readFileSync/copyFileSync. resolveInsideRepo() now rejects directory targets up front with a clear message. Addresses CodeRabbit review feedback on santifer#712. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
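A minimal sketch of the up-front directory rejection, assuming a standalone helper. In the PR the check lives inside `resolveInsideRepo()`; the helper name `assertNotDirectory`, its signature and the error wording here are hypothetical.

```javascript
import { existsSync, mkdtempSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: fail early with a clear message instead of letting
// readFileSync/copyFileSync throw EISDIR later.
function assertNotDirectory(flag, target) {
  if (existsSync(target) && statSync(target).isDirectory()) {
    throw new Error(`${flag} must point to a file, but "${target}" is a directory`);
  }
  return target;
}

// Demo: a temp directory stands in for `--state batch`.
const dir = mkdtempSync(join(tmpdir(), "reconcile-dir-"));
const file = join(dir, "batch-state.tsv");
writeFileSync(file, "");

let message = "";
try {
  assertNotDirectory("--state", dir);
} catch (err) {
  message = err.message;
}
```

A path that does not exist yet passes through unchanged, so this check does not interfere with the missing-target case handled by the boundary check.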
Closes #711.
Problem
`batch/batch-runner.sh` records every evaluated offer in `batch/batch-state.tsv`, but it never writes back to `data/pipeline.md`. Offers processed via batch mode therefore stay in the pipeline "Pendientes" inbox indefinitely.
On the next `scan` or `/career-ops pipeline` run those entries get re-surfaced and evaluated a second time, producing duplicate reports and duplicate tracker rows. For anyone running batch regularly, the inbox never drains.
Fix
New script `reconcile-pipeline.mjs`:
- Reads `batch-state.tsv`; for every `completed`/`skipped` entry whose URL is still in `pipeline.md` "Pendientes", moves that line to "Procesadas" with its report link, score and PDF flag.
- Conservative: `failed` entries stay in "Pendientes" (they were not evaluated); an entry whose report file is missing on disk is left in place rather than writing a dead link.
- User-supplied `--state`/`--pipeline` arguments are constrained to the repository tree.
- Handles both `Pendientes`/`Procesadas` and `Pending`/`Processed` section headers.
`batch-runner.sh` calls it from `merge_tracker()`, between the tracker merge and the integrity check. Also exposed as `npm run reconcile` for standalone/manual use.
Testing
- `test-all.mjs` covers the new script (syntax + graceful-run on empty data); full suite green.
- Run against a `batch-state.tsv` + `pipeline.md` pair: the matching entry moves to "Procesadas" with the correct report link/score/PDF; a second run reports "already in sync".
Notes
- Backs up `pipeline.md` to `pipeline.md.pre-reconcile.bak` before writing.
🤖 Generated with Claude Code
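The backup-before-write step from the notes could look roughly like this. It is a sketch, not the PR's implementation: the helper name `writeWithBackup` is hypothetical, and skipping the backup when nothing changed is an assumption; only the `.pre-reconcile.bak` suffix is taken from the PR.

```javascript
import { copyFileSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: copy the current file aside, then overwrite it.
function writeWithBackup(file, newText) {
  const current = readFileSync(file, "utf8");
  if (current === newText) return false; // assumed: no backup or write when already in sync
  copyFileSync(file, `${file}.pre-reconcile.bak`);
  writeFileSync(file, newText);
  return true;
}

// Demo in a throwaway directory.
const workDir = mkdtempSync(join(tmpdir(), "reconcile-bak-"));
const pipelineFile = join(workDir, "pipeline.md");
const originalText = "## Pendientes\n- [ ] https://example.com/a\n\n## Procesadas\n";
const updatedText = "## Pendientes\n\n## Procesadas\n- [x] https://example.com/a\n";
writeFileSync(pipelineFile, originalText);

const firstRun = writeWithBackup(pipelineFile, updatedText);
const secondRun = writeWithBackup(pipelineFile, updatedText);
```

Taking the backup with `copyFileSync` before the write means the pre-reconcile state survives even if the write itself is interrupted.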
Summary by CodeRabbit
New Features
Documentation
Tests