@lbrecic lbrecic commented Nov 21, 2025

Description

This pull request improves the reliability and maintainability of the snapshot propagation logic in the SnapshotManager. The main changes include refactoring the way snapshots are propagated to runners, enhancing error logging, and adjusting lock durations for synchronization. These updates help ensure better error handling, clearer code, and more robust background job execution.

Snapshot propagation and error handling improvements:

  • Refactored the propagateSnapshotToRunners method to accept a Snapshot object instead of just a reference string, and updated all related usages to pass the full snapshot. This improves clarity and consistency throughout the code.
  • Enhanced error handling in the snapshot propagation process by using Promise.allSettled and logging all promise rejections, ensuring that errors are not silently ignored.
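The propagation pattern described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `Snapshot` and `Runner` shapes, the `propagate` callback, and the `logError` parameter are all hypothetical stand-ins for whatever the SnapshotManager really uses.

```typescript
// Hypothetical shapes: the real Snapshot entity and runner type are not shown in this PR description.
interface Snapshot {
  id: string;
  ref: string;
}

interface Runner {
  id: string;
}

// Hypothetical per-runner operation (e.g. asking one runner to pull the snapshot).
type PropagateFn = (runner: Runner, snapshot: Snapshot) => Promise<void>;

// Takes the full Snapshot object rather than only its reference string,
// waits for every runner to settle, and logs each rejection.
// Returns the number of runners for which propagation failed.
async function propagateSnapshotToRunners(
  snapshot: Snapshot,
  runners: Runner[],
  propagate: PropagateFn,
  logError: (message: string) => void = console.error,
): Promise<number> {
  // The async wrapper turns synchronous throws into rejections as well,
  // so one misbehaving runner cannot abort the whole batch.
  const results = await Promise.allSettled(
    runners.map(async (runner) => propagate(runner, snapshot)),
  );

  let failures = 0;
  results.forEach((result, index) => {
    if (result.status === 'rejected') {
      failures += 1;
      const reason =
        result.reason instanceof Error ? result.reason.message : String(result.reason);
      logError(
        `Failed to propagate snapshot ${snapshot.ref} to runner ${runners[index].id}: ${reason}`,
      );
    }
  });

  return failures;
}
```

Unlike `Promise.all`, `Promise.allSettled` never rejects early, so every failure is visible in the results array and can be logged individually.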

Synchronization and scheduling adjustments:

  • Increased the Redis lock duration for the syncRunnerSnapshots job from 30 seconds to one hour, reducing the risk of concurrent executions.
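  • Removed the waitForCompletion: true option from the @Cron decorator for the check-snapshot-state job. Without it, scheduled ticks are no longer skipped while a previous run is still in progress, which may improve job scheduling and keep one slow run from blocking later ones.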
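To illustrate why the lock duration matters, here is a small sketch using an in-memory stand-in for a Redis lock with an expiry. The `TtlLock` class, the key name, and the injected clock are assumptions made for the example; the PR's actual lock helper is not shown here.

```typescript
// In-memory stand-in for a Redis-style lock with a time-to-live (hypothetical helper).
class TtlLock {
  private expiresAt = new Map<string, number>();
  private now: () => number;

  // The clock is injectable so the behaviour can be demonstrated deterministically.
  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  // Returns true if the lock was acquired, false if a live lock already exists.
  tryAcquire(key: string, ttlMs: number): boolean {
    const current = this.expiresAt.get(key);
    if (current !== undefined && current > this.now()) {
      return false;
    }
    this.expiresAt.set(key, this.now() + ttlMs);
    return true;
  }

  // Explicit release, so a finished job does not hold the lock for the full TTL.
  release(key: string): void {
    this.expiresAt.delete(key);
  }
}

const THIRTY_SECONDS_MS = 30 * 1000;
const ONE_HOUR_MS = 60 * 60 * 1000;
const LOCK_KEY = 'sync-runner-snapshots'; // hypothetical key name

// Simulates a second scheduler tick arriving one minute into a long-running job.
function secondTickAcquires(ttlMs: number): boolean {
  let time = 0;
  const lock = new TtlLock(() => time);
  lock.tryAcquire(LOCK_KEY, ttlMs); // first run takes the lock at t = 0
  time = 60 * 1000; // first run is still working at t = 60 s
  return lock.tryAcquire(LOCK_KEY, ttlMs); // can a second run start?
}

const overlapsWithShortTtl = secondTickAcquires(THIRTY_SECONDS_MS);
const overlapsWithLongTtl = secondTickAcquires(ONE_HOUR_MS);
```

With the 30-second TTL the lock has already expired when the second tick arrives, so two runs overlap; with the one-hour TTL the second tick is turned away. The trade-off is that a holder that crashes without releasing keeps the lock until the longer TTL elapses.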

Type and import clean-up:

  • Updated imports to use FindOptionsWhere for improved type safety and query clarity in TypeORM operations.

Documentation

  • This change requires a documentation update
  • I have made corresponding changes to the documentation


3 participants