Dead letter queue and resume failed tasks #46

@psilabs-dev

If a task fails for a transient reason (for example an internet blackout or a temporary DNS failure), the worker should send the failed message to a dead letter queue (DLQ) instead of marking the task as complete. The server should then provide APIs to: list the contents of the DLQ, resume tasks from the DLQ, resume a particular task from the DLQ, and drop tasks from the DLQ.
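To make the four requested operations concrete, here is a minimal in-memory sketch. Everything in it is an assumption for illustration: the names `FailedTask`, `DeadLetterQueue`, `list_tasks`, `resume`, `resume_all`, and `drop` are hypothetical, "resume tasks" is read as "resume everything currently in the DLQ", and the real storage would presumably be whatever message broker the project already uses.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class FailedTask:
    """Hypothetical record for a message that failed with a retryable error."""
    task_id: str
    payload: dict
    reason: str


class DeadLetterQueue:
    """In-memory stand-in for a DLQ; `requeue` puts a task back on the work queue."""

    def __init__(self, requeue: Callable[[FailedTask], None]) -> None:
        self._items: "OrderedDict[str, FailedTask]" = OrderedDict()
        self._requeue = requeue

    def add(self, task: FailedTask) -> None:
        # Called by the worker instead of marking the task as complete.
        self._items[task.task_id] = task

    def list_tasks(self) -> List[FailedTask]:
        # "List contents in DLQ", oldest first.
        return list(self._items.values())

    def resume(self, task_id: str) -> bool:
        # "Resume particular task from DLQ"; False if the id is unknown.
        task = self._items.pop(task_id, None)
        if task is None:
            return False
        self._requeue(task)
        return True

    def resume_all(self) -> int:
        # "Resume tasks from DLQ"; returns how many tasks were requeued.
        task_ids = list(self._items)
        for task_id in task_ids:
            self.resume(task_id)
        return len(task_ids)

    def drop(self, task_id: Optional[str] = None) -> int:
        # "Drop tasks from DLQ": one task if an id is given, otherwise all.
        if task_id is None:
            count = len(self._items)
            self._items.clear()
            return count
        return 1 if self._items.pop(task_id, None) is not None else 0
```

In a server, each method would map naturally onto one endpoint, with `requeue` wired to the existing task producer.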

The expectation is that the DLQ is used only for failures that qualify for a future retry. It should NOT be used for failures that cannot be fixed by retrying, such as:

  • image not being available
  • not authorized/forbidden

In effect, only purely connection-related failures currently qualify for the DLQ.
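The retry rule above could be sketched as a small routing function on the worker side. This is only an illustration under assumptions: `ImageUnavailableError`, `NotAuthorizedError`, and `route_failure` are hypothetical names, the choice of built-in exception types standing in for "connection-related" is a guess, and routing unknown errors away from the DLQ is a conservative default rather than something the issue specifies.

```python
import socket


class ImageUnavailableError(Exception):
    """Hypothetical: the requested image does not exist (not retryable)."""


class NotAuthorizedError(Exception):
    """Hypothetical: the request was unauthorized or forbidden (not retryable)."""


# Assumed set of connection-related failures that qualify for a future retry.
# socket.gaierror is what Python raises when DNS resolution fails.
RETRYABLE_ERRORS = (ConnectionError, TimeoutError, socket.gaierror)


def route_failure(exc: BaseException) -> str:
    """Return "dlq" for retryable failures and "failed" for everything else."""
    if isinstance(exc, RETRYABLE_ERRORS):
        return "dlq"
    # Missing images, authorization errors, and unknown errors are not
    # sent to the DLQ in this sketch.
    return "failed"
```

For example, `route_failure(ConnectionResetError())` returns `"dlq"`, while `route_failure(NotAuthorizedError("403"))` returns `"failed"`.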

Labels: enhancement