chore: add docker healthcheck for worker and server #332
Conversation
Walkthrough: Adds two HTTP health endpoints (`/healthcheck` and `/liveliness`).
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Docker as Docker/Compose
    participant Container as App Container
    participant Rails as Rails App
    participant DB as Database
    participant Redis as Redis
    Note over Docker,Container: Container health probe (Dockerfile / compose)
    Docker->>Container: HTTP GET /healthcheck
    alt 200 OK
        Container->>Docker: 200 OK
    else non-200 / no response
        Container->>Docker: failure
    end
    Note over Container,Rails: Rails healthcheck handling
    Container->>Rails: GET /healthcheck
    Rails->>DB: SELECT 1
    DB-->>Rails: OK / error
    Rails->>Redis: PING
    Redis-->>Rails: PONG / error
    Rails-->>Container: JSON {db, redis, ok}, status 200 or 503
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- Dockerfile (2 hunks)
- compose.example.yml (2 hunks)
🔇 Additional comments (2)
Dockerfile (1)
61-62: HEALTHCHECK is well configured.

The healthcheck correctly probes the application on localhost:3000 using curl with the --fail flag, and the timing parameters (30s interval, 5s timeout, 10s start-period, 3 retries) are sensible defaults. The curl binary is available in the final stage since it's installed in the base image (line 12).
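As an aside, the `--retries` behavior the timing parameters rely on can be sketched in plain Ruby. This is an illustrative model of Docker's documented semantics, not actual Docker code: a container only transitions to unhealthy after that many consecutive probe failures, and any success resets the count.

```ruby
# Illustrative model (assumption: mirrors Docker's documented HEALTHCHECK
# behavior) of --retries=3: unhealthy only after 3 consecutive failures.
def unhealthy?(probe_results, retries: 3)
  streak = 0
  probe_results.each do |ok|
    streak = ok ? 0 : streak + 1
    return true if streak >= retries
  end
  false
end
```

With `retries: 3`, a history like `[false, true, false, false]` stays healthy, because the one success resets the failure streak.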
compose.example.yml (1)
83-88: Worker service healthcheck is well configured.

The process-based health check using `pgrep -f sidekiq` is a reasonable approach to monitor the Sidekiq worker, and the timing parameters (30s interval, 5s timeout, 3 retries, 10s start-period) are appropriate for this type of check.
Thanks for sending this in, @xBlaz3kx! Love all the small "quality of life" improvements like this ... My main concern with this is simple, but one I think you'll appreciate given your line of work: logs become a mess when you do
So you can see that I at least moved … Want to rewrite it like that? Not sure what a good "compose friendly" way of doing it would be. 🤷‍♂️
jjmata
left a comment
Let's decide whether we go all the way to an HTTP GET health check or do something more nimble.
```dockerfile
# Entrypoint prepares the database.
ENTRYPOINT ["/rails/bin/docker-entrypoint"]

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
```
Rather than curl a full HTTP request, can we do a TCP check to make sure the container is up? Always hard to decide how far to go in declaring an app "healthy" ... roundtrip to the DB included?
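A TCP-only probe like the one suggested here could be sketched in Ruby (the helper name and timeout are illustrative; a real Docker HEALTHCHECK would more likely use a shell one-liner):

```ruby
require "socket"

# Minimal TCP liveness sketch: healthy if the port accepts a connection.
# No HTTP request is made, so no app code or DB roundtrip is exercised.
def tcp_up?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError
  false
end
```

The tradeoff raised above is exactly this: the check passes as soon as something listens on the port, even if the app can't reach its database.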
Usually, you have a liveliness check and a healthcheck endpoint. The liveliness check determines if the application is up and running. The health check endpoint would check if it's running as expected (connected to the DB, message queues, etc).
I'm not familiar with Ruby, so I couldn't dig up the appropriate endpoint to use. Perhaps an OpenAPI spec would be a nice addition to the docs.
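The split described here can be sketched framework-free in Ruby (`db_ok` and `redis_ok` are stand-in callables, not real checks; the status/body pairs are illustrative):

```ruby
# Liveness: only "is the process up?" -- always 200 if we can respond.
def liveness
  [200, nil]
end

# Health: additionally verify dependencies; 503 if any check fails.
def health(db_ok:, redis_ok:)
  checks = { db: db_ok.call, redis: redis_ok.call }
  [checks.values.all? ? 200 : 503, checks]
end
```

An orchestrator would restart a container on liveness failure but merely stop routing traffic to it on health (readiness) failure.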
For compose, we can just use the liveliness endpoint, since the app will most likely not boot if migrations aren't applied correctly. For Kubernetes, migrations should probably be run before the main application starts (in a separate sidecar container).
I agree, polluting logs with healthchecks is a nightmare, especially when hitting the … endpoint.

There are other ways around this; for example, if you are using Promtail, you can filter out logs that hit the healthcheck endpoint.

I'll try to find a better solution for the healthcheck; maybe I can add dedicated health endpoints, and we can skip logs on those.
Force-pushed from 49fbcaf to 226dda2
Actionable comments posted: 1
♻️ Duplicate comments (1)
Dockerfile (1)
61-62: Consider using the `/liveliness` endpoint to reduce log pollution.

Based on the PR discussion, HTTP healthcheck log pollution is a concern. The `/liveliness` endpoint would be more appropriate for the Docker HEALTHCHECK because:

- It silences logs (via `silence_logger` in the controller), addressing the log pollution concern raised by the maintainer
- It's a lightweight check that only verifies the app is running (appropriate for container-level liveness)
- The `/healthcheck` endpoint performs deeper checks (database, Redis) that may be unnecessarily heavy for frequent container health probes

Apply this diff to use the liveliness endpoint:

```diff
-HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
-  CMD curl --fail http://localhost:3000/healthcheck || exit 1
+HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
+  CMD curl --fail http://localhost:3000/liveliness || exit 1
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- Dockerfile (2 hunks)
- app/controllers/health_controller.rb (1 hunks)
- compose.example.yml (2 hunks)
- config/routes.rb (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- compose.example.yml
🔇 Additional comments (6)
app/controllers/health_controller.rb (5)
1-9: LGTM! Well-designed health endpoint setup.

The controller appropriately:
- Skips authentication and CSRF for health endpoints
- Applies `silence_logger` only to `/liveliness` to reduce log noise from frequent container probes
- Keeps logging enabled for `/healthcheck` so that actual health failures are visible

This design aligns with the PR discussion about minimizing log pollution while maintaining observability for real issues.
11-21: LGTM! Proper healthcheck implementation.

The healthcheck action correctly:
- Aggregates multiple dependency checks (database and Redis)
- Returns appropriate HTTP status codes (200 OK vs 503 Service Unavailable)
- Provides detailed JSON response for debugging
23-25: LGTM! Lightweight liveliness probe.

The liveliness endpoint appropriately returns a simple 200 OK with no body, suitable for frequent container health checks.
29-35: LGTM! Database check is correct.

The database check performs a lightweight `SELECT 1` query and properly handles errors.
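The probe-and-log pattern the reviews describe for both checks can be factored into one helper. A hedged sketch, with `warn` standing in for `Rails.logger.error` and a hypothetical helper name:

```ruby
# Hypothetical generalization of check_database / check_redis:
# run a probe, return true/false, and log a message on failure.
def check_dependency(name)
  yield
  true
rescue StandardError => e
  warn "#{name} health check failed: #{e.message}"
  false
end
```

In a Rails context this might be invoked as, for illustration only, `check_dependency("db") { ActiveRecord::Base.connection.execute("SELECT 1") }`.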
48-54: LGTM! Proper logger silencing implementation.

The `silence_logger` helper correctly preserves the original log level and restores it in an ensure block.

config/routes.rb (1)

336-338: LGTM! Health routes properly configured.

The new routes are correctly placed alongside the existing Rails health check route and properly map to the HealthController actions.
Force-pushed from 226dda2 to ae41499
Actionable comments posted: 1
🧹 Nitpick comments (3)
app/controllers/health_controller.rb (3)
23-25: Simple and appropriate liveness probe.

The minimal implementation is ideal for a liveness endpoint that will be polled frequently by monitoring tools.

Optional: Consider the naming convention. Kubernetes uses "liveness" rather than "liveliness" for probe endpoints. While "liveliness" is valid, "liveness" is more conventional in the container orchestration ecosystem.
37-43: Past review concern addressed correctly.

The implementation now uses `Sidekiq.redis(&:ping)`, which correctly leverages the application's configured Redis connection pool, addressing the previous review feedback.

Consider broadening exception handling for consistency. The `check_database` method catches `StandardError`, but this method only catches `Redis::CannotConnectError`. Other Redis errors (timeouts, connection errors, etc.) might not be caught. For consistency and defensive coding, consider catching `StandardError` here as well.

Apply this diff for consistency with `check_database`:

```diff
 def check_redis
   Sidekiq.redis(&:ping)
   true
-rescue Redis::CannotConnectError => e
+rescue StandardError => e
   Rails.logger.error("Redis health check failed: #{e.message}")
   false
 end
```
45-51: Correct implementation with proper cleanup.

The manual logger level manipulation with an ensure block correctly silences logs and guarantees restoration of the original level.

Optional: Rails has built-in log silencing. While your implementation works well, Rails provides the `LoggerSilence` concern with a `silence` method. However, your explicit implementation is clear and correct.
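The manual silencing pattern discussed above can be reproduced standalone with Ruby's stdlib `Logger` (a sketch of the same idea outside Rails; the helper name is hypothetical):

```ruby
require "logger"

# Raise the level to Logger::UNKNOWN for the block's duration; the
# ensure clause restores the previous level even if the block raises.
def with_silenced_logger(logger)
  previous = logger.level
  logger.level = Logger::UNKNOWN
  yield
ensure
  logger.level = previous
end
```

`Logger::UNKNOWN` is the highest severity, so every ordinary `debug`/`info`/`warn`/`error` call is suppressed while the block runs.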
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- Dockerfile (2 hunks)
- app/controllers/health_controller.rb (1 hunks)
- compose.example.yml (2 hunks)
- config/routes.rb (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- compose.example.yml
- Dockerfile
🧰 Additional context used
🪛 GitHub Check: ci / lint
app/controllers/health_controller.rb
[failure] 53-53:
Layout/TrailingEmptyLines: 1 trailing blank lines detected.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: ci / test
🔇 Additional comments (5)
app/controllers/health_controller.rb (4)
1-6: LGTM! Appropriate callback skips for health endpoints.

Skipping authentication and setup callbacks is correct for health check endpoints, ensuring they remain fast and accessible for monitoring tools.

8-9: Good solution to the log pollution concern.

Silencing logs only for the liveliness endpoint addresses the issue raised in the PR discussion about healthcheck-induced log clutter while keeping detailed logging for the healthcheck endpoint.

11-21: LGTM! Standard healthcheck pattern with proper status codes.

The implementation correctly aggregates individual health checks and returns appropriate HTTP status codes (200 for healthy, 503 for unhealthy) along with detailed JSON for debugging.
29-35: LGTM! Efficient database health check.

Using `SELECT 1` is an efficient and standard approach for database health checks. The broad exception handling with error logging is appropriate.

config/routes.rb (1)

336-338: LGTM! Well-placed and correctly configured routes.

The new health check routes are properly defined and logically placed near the existing Rails health check endpoint, making them easy to find and maintain.
Actionable comments posted: 0
🧹 Nitpick comments (1)
app/controllers/health_controller.rb (1)
37-43: Consider broader error handling for consistency.

The `check_redis` method only rescues `Redis::CannotConnectError`, while `check_database` rescues the broader `StandardError`. Redis operations can raise other exceptions such as `Redis::TimeoutError`, `Redis::ConnectionError`, or `Redis::CommandError`. If these occur, they'll propagate up and cause a 500 error instead of being handled gracefully.

For consistency and resilience, consider rescuing `StandardError` like `check_database` does:

```diff
 def check_redis
   Sidekiq.redis(&:ping)
   true
-rescue Redis::CannotConnectError => e
+rescue StandardError => e
   Rails.logger.error("Redis health check failed: #{e.message}")
   false
 end
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
app/controllers/health_controller.rb(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: ci / test
🔇 Additional comments (3)
app/controllers/health_controller.rb (3)
2-6: LGTM! Appropriate exemptions for health endpoints.

Skipping authentication, CSRF protection, and request tracking callbacks is correct for health check endpoints that need to be accessible by container orchestration tools without credentials.
8-21: Verify the logging strategy for the healthcheck endpoint.

The `silence_logger` wrapper is only applied to the `liveliness` endpoint, but the `healthcheck` endpoint (which returns detailed JSON status) will still generate log entries on every Docker healthcheck poll. Based on the PR discussion where log pollution was a primary concern, should `healthcheck` also be included in the `silence_logger` around_action?

If frequent detailed healthcheck logging is intentional (e.g., to track failures in application logs), this is fine. However, if Docker polls `/healthcheck` every few seconds, it will create the same log clutter issue that was raised in the PR comments.

Consider applying silence_logger to both endpoints if log pollution is a concern:

```diff
-around_action :silence_logger, only: :liveliness
+around_action :silence_logger, only: [:liveliness, :healthcheck]
```
45-51: LGTM! Solid logger silencing implementation.

The use of `ensure` to restore the logger level is correct, and `Logger::UNKNOWN` is the appropriate level to suppress all log output. This effectively addresses the log pollution concern raised in the PR discussion.
Force-pushed from f477ea1 to 0a45abf
Actionable comments posted: 0
🧹 Nitpick comments (1)
app/controllers/health_controller.rb (1)
37-43: Redis check only covers Sidekiq's Redis connection.

The implementation correctly uses `Sidekiq.redis(&:ping)` (as addressed in previous reviews). However, this only verifies the Redis connection configured for Sidekiq. If your application uses Redis for other purposes, such as caching, sessions, or ActionCable, those connections won't be verified by this health check.

For most applications where Redis is primarily used with Sidekiq, this implementation is sufficient. If you need to check other Redis connections, you may want to add additional checks for those specific Redis clients.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
app/controllers/health_controller.rb(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: ci / test
🔇 Additional comments (2)
app/controllers/health_controller.rb (2)
8-9: Consider silencing logs for the healthcheck endpoint too.

The `silence_logger` is currently applied only to the `liveliness` endpoint. However, based on the Dockerfile changes in this PR, Docker's HEALTHCHECK will hit the `/healthcheck` endpoint periodically (typically every 30 seconds by default). This may still generate log pollution, the issue that motivated this PR.

If the `/healthcheck` endpoint is intended for detailed monitoring while `/liveliness` is for frequent probes, the current setup is appropriate. Otherwise, consider extending log silencing to both endpoints or using the `/liveliness` endpoint for Docker HEALTHCHECK instead.

Would you like to verify which endpoint Docker HEALTHCHECK should use, or would you prefer to extend silence_logger to both endpoints?
1-52: LGTM! Clean implementation addressing PR objectives.

The implementation provides two well-designed health endpoints:
- `/healthcheck`: detailed status with database and Redis checks, suitable for comprehensive health monitoring
- `/liveliness`: lightweight probe with silenced logging, ideal for frequent Kubernetes liveness checks

The code correctly:
- Skips authentication and CSRF for unauthenticated health endpoints
- Uses proper error handling with logging
- Implements safe logger silencing with an `ensure` block
- Addresses previous review feedback (using `Sidekiq.redis(&:ping)` instead of `Redis.new`)

This successfully addresses the log pollution concerns raised in issue #269 and the PR discussion.
So I discovered DHH took care of this in Rails 8 ... why don't we wait until we upgrade to it? I was planning on doing it after

Addressing #269
Summary by CodeRabbit
New Features
Chores