feat: add methods to list and manage failed jobs in Redis by ren0503 · Pull Request #69 · tinh-tinh/queue

ren0503 · 2026-01-17T03:16:29Z

Add GetFailedJobs() to retrieve all failed jobs from Redis
Add GetFailedJob(jobId) to get a specific failed job's error
Add ClearFailedJobs() to remove all failed job records
Add scanFailedJobKeys() helper to eliminate code duplication
Reuse existing Job struct instead of creating new FailedJobInfo
Add comprehensive unit tests in failed_jobs_test.go

Failed jobs are stored in Redis after exhausting all retries. These methods allow listing, inspecting, and clearing failed jobs.

Closes #68

coderabbitai · 2026-01-17T03:16:38Z

Summary by CodeRabbit

New Features
- Retrieve a list of failed jobs with failure reasons and statuses
- Retrieve failure details for a specific failed job by ID
- Clear all failed jobs from the queue, with idempotent behavior
Tests
- Added comprehensive tests covering listing, retrieval, clearing, idempotency, and error scenarios


Walkthrough

Adds Redis-backed failed-job management to Queue: scanning failed-job keys, listing all failed jobs, fetching a failed job by ID, and clearing failed-job entries. New tests validate listing, lookup, clearing, and Redis error handling.

Changes

Cohort / File(s)	Summary
Failed-job API and Redis operations `queue.go`	Added `(Queue) scanFailedJobKeys() ([]string,error)`, `(Queue) GetFailedJobs() ([]Job,error)`, `(Queue) GetFailedJob(jobId string) (string,error)`, and `(Queue) ClearFailedJobs() error`. Implements Redis SCAN for `failed:*` keys, reads failure reasons, handles not-found/Redis errors, and deletes failed-job keys.
Tests for failed-job behavior `failed_jobs_test.go`	New test suite in `queue_test` exercising `GetFailedJobs`, `GetFailedJob`, and `ClearFailedJobs` (including Redis error cases). Tests enqueue failing jobs, assert failure reasons/statuses, verify clearing, and validate error messages for Redis connectivity issues.

Sequence Diagram(s)

sequenceDiagram
  participant Producer as Producer
  participant Queue as Queue
  participant Worker as Worker
  participant Redis as Redis
  participant Client as Client

  rect rgba(200,200,255,0.5)
    Producer->>Queue: AddJob(job)
    Queue->>Redis: RPUSH queue:list job
  end

  rect rgba(200,255,200,0.5)
    Worker->>Redis: BRPOP queue:list
    Worker->>Worker: Process job (fails)
    Worker->>Redis: SET failed:<jobId> = failureReason
  end

  rect rgba(255,200,200,0.5)
    Client->>Queue: GetFailedJobs()
    Queue->>Redis: SCAN failed:* -> keys
    Queue->>Redis: MGET keys -> failureReasons
    Queue-->>Client: []Job{Id, FailureReason, Status}
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

feat: add prefix to interaction with redis #61: Adjusts key-prefixing/getKey logic that affects how failed-job Redis keys are named and accessed (strongly related).

Poem

🐰 Hopping through Redis late at night,

I sniff the keys that lost their fight.
I list their woes and fetch their names,
Then clear the tracks of failed job games.
🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main change: adding methods to manage failed jobs in Redis, which matches the core functionality in the changeset.
Description check	✅ Passed	The description is directly related to the changeset, detailing the specific methods added and their purpose for managing failed jobs in Redis.
Linked Issues check	✅ Passed	The PR fully implements issue `#68` requirements: provides methods to list failed jobs (GetFailedJobs, GetFailedJob), displays relevant metadata (job ID, failure reason, status), and improves visibility for monitoring and debugging.
Out of Scope Changes check	✅ Passed	All changes are in scope: four new methods for failed job management, a helper method, and comprehensive tests directly address the linked issue `#68` objectives.


codecov-commenter · 2026-01-17T03:18:16Z

Codecov Report

❌ Patch coverage is 86.00000% with 7 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
queue.go	86.00%	5 Missing and 2 partials ⚠️


coderabbitai

Actionable comments posted: 3

🤖 Fix all issues with AI agents

In `@failed_jobs_test.go`:
- Around line 49-50: Replace the fixed time.Sleep(500 * time.Millisecond) calls
in failed_jobs_test.go with an eventual/polling assertion: repeatedly check the
expected condition (e.g., call the same helper that inspects job state such as
listFailedJobs/getJobStatus or the DB/query used elsewhere in this test) at
short intervals until it becomes true or a reasonable timeout elapses, failing
the test on timeout; do this for the three sleep sites (the Sleep at the shown
diff and the other occurrences around lines noted 101-103 and 149-150) so tests
wait deterministically for the expected state instead of sleeping a fixed
duration.
- Around line 14-23: Add a Redis connectivity gate before creating the
failedQueue in failed_jobs_test.go: attempt a quick ping/connect to the same
Redis address used in queue.New (localhost:6379) and if it fails, call t.Skipf
(or skip the test) unless an explicit env override (e.g., RUN_REDIS_TESTS=true)
is set; apply the same pattern to the other places noted (lines ~75-83 and
~120-128) so any code that constructs queues via queue.New is guarded by the
connectivity check and optional env override.

In `@queue.go`:
- Around line 507-531: In GetFailedJobs, don't swallow all q.client.Get(...)
errors; only ignore redis.Nil (missing key) and propagate any other errors
(network/timeouts) up to the caller. Update the loop in GetFailedJobs to check
if err == redis.Nil and continue, otherwise return nil, err; keep using
scanFailedJobKeys and the same job construction (Id, FailedReason, Status,
queue) when a value is present. Ensure the error handling mirrors the pattern
used in GetFailedJob so transient Redis failures are not hidden.

🧹 Nitpick comments (1)

queue.go (1)
479-501: Consider adding explicit :failed: namespace prefix for better key isolation.

Currently, scanFailedJobKeys() uses the pattern cachedKey + ":*" which works correctly because rate limiter keys use only cachedKey (without colon). However, this relies on an implementation detail. Adding an explicit :failed: namespace would make the separation clearer and prevent accidental collisions if new key types are added later.

The silent error handling in GetFailedJobs (lines 515–518) is intentional—it retrieves available job records despite transient Redis errors, which is reasonable behavior.
🔧 Suggested refactoring (optional)
+func (q *Queue) failedKeyPrefix() string {
+	return q.cachedKey + ":failed:"
+}
+
 func (q *Queue) scanFailedJobKeys() ([]string, error) {
-	pattern := q.cachedKey + ":*"
+	pattern := q.failedKeyPrefix() + "*"
@@
-	jobId := strings.TrimPrefix(key, q.cachedKey+":")
+	jobId := strings.TrimPrefix(key, q.failedKeyPrefix())
@@
-	key := q.cachedKey + ":" + jobId
+	key := q.failedKeyPrefix() + jobId
Also applies to job.go line 129 if desired.

coderabbitai · 2026-01-17T03:22:49Z

failed_jobs_test.go

+	failedQueue := queue.New("failed_jobs_test", &queue.Options{
+		Connect: &redis.Options{
+			Addr:     "localhost:6379",
+			Password: "",
+			DB:       0,
+		},
+		Workers:       3,
+		RetryFailures: 0, // No retries, so jobs fail immediately
+	})
+


⚠️ Potential issue | 🟡 Minor

Gate Redis-backed tests when Redis isn’t available.

These tests hard-depend on a local Redis at localhost:6379, which will fail in many dev/CI environments. Add a connectivity check and skip if Redis is unreachable (or allow override via env var).

💡 Suggested pattern (apply to each test)

 import (
+	"context"
 	"errors"
 	"testing"
 	"time"
@@
 	"github.com/redis/go-redis/v9"
 	"github.com/stretchr/testify/require"
 	"github.com/tinh-tinh/queue/v2"
 )

+func requireRedis(t *testing.T, opt *redis.Options) {
+	t.Helper()
+	client := redis.NewClient(opt)
+	if err := client.Ping(context.Background()).Err(); err != nil {
+		t.Skipf("redis not available: %v", err)
+	}
+}
+
 func Test_GetFailedJobs(t *testing.T) {
-	failedQueue := queue.New("failed_jobs_test", &queue.Options{
-		Connect: &redis.Options{
-			Addr:     "localhost:6379",
-			Password: "",
-			DB:       0,
-		},
+	redisOpt := &redis.Options{
+		Addr:     "localhost:6379",
+		Password: "",
+		DB:       0,
+	}
+	requireRedis(t, redisOpt)
+	failedQueue := queue.New("failed_jobs_test", &queue.Options{
+		Connect:       redisOpt,
 		Workers:       3,
 		RetryFailures: 0, // No retries, so jobs fail immediately
 	})

Also applies to: 75-83, 120-128


coderabbitai · 2026-01-17T03:22:49Z

failed_jobs_test.go

+	// Wait a bit for jobs to be processed
+	time.Sleep(500 * time.Millisecond)


⚠️ Potential issue | 🟡 Minor

Avoid fixed sleeps; they’re prone to flakes.

Replace time.Sleep with an eventual assertion (poll until the expected state appears). This avoids timing sensitivity under load/slow CI.

♻️ Example replacement (apply similarly to other sleeps)

-	// Wait a bit for jobs to be processed
-	time.Sleep(500 * time.Millisecond)
-
-	// Retrieve failed jobs
-	failedJobs, err := failedQueue.GetFailedJobs()
-	require.Nil(t, err)
-	require.Equal(t, 3, len(failedJobs))
+	// Wait until failed jobs are visible
+	var failedJobs []queue.Job
+	var err error
+	require.Eventually(t, func() bool {
+		failedJobs, err = failedQueue.GetFailedJobs()
+		return err == nil && len(failedJobs) == 3
+	}, 5*time.Second, 50*time.Millisecond)
+	require.NoError(t, err)

Also applies to: 101-103, 149-150


ren0503 added this to the Queue v2.1.2 milestone Jan 17, 2026

ren0503 force-pushed the feat/ren/68-list-failed-job-in-redis branch from bdcbc62 to 0664f40 Compare January 17, 2026 03:22

coderabbitai bot reviewed Jan 17, 2026


ren0503 force-pushed the feat/ren/68-list-failed-job-in-redis branch from 0664f40 to 435cfa9 Compare January 17, 2026 03:41

ren0503 merged commit a054ff0 into master Jan 17, 2026
2 checks passed
