Prerequisites
Bug Description
When the health monitor cannot publish to the Unix domain socket (UDS) after all retries have been exhausted, it does not exit with an error. In some cases, this causes health messages to be dropped.
The health monitor should exit with status code 1 if all retries for the publish have failed.
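A minimal sketch of the expected behavior, not the actual NVSentinel code: the helper names `publish_with_retries` and `publish_or_exit`, the `send` callable, and the retry and backoff defaults are all hypothetical and only illustrate "retry, then exit with status 1".

```python
import sys
import time


def publish_with_retries(send, payload, max_retries=3, backoff_s=0.1):
    """Call send(payload) up to max_retries times; return True on success.

    `send` is a hypothetical callable that writes to the UDS and raises
    OSError (e.g. ConnectionRefusedError) when the publish fails.
    """
    for attempt in range(1, max_retries + 1):
        try:
            send(payload)
            return True
        except OSError as err:
            print(f"publish attempt {attempt}/{max_retries} failed: {err}",
                  file=sys.stderr)
            if attempt < max_retries:
                # Linear backoff between attempts (an assumed policy).
                time.sleep(backoff_s * attempt)
    return False


def publish_or_exit(send, payload, max_retries=3, backoff_s=0.1):
    """Exit the process with status 1 if every publish attempt failed."""
    if not publish_with_retries(send, payload, max_retries, backoff_s):
        print("all publish retries exhausted; exiting", file=sys.stderr)
        sys.exit(1)
```

With this shape, a failed publish surfaces as a non-zero exit that a supervisor such as Kubernetes can observe and react to, rather than a health message that is lost without any signal.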
Component
Health Monitor
Steps to Reproduce
- Start the health monitor
- Disable the platform connector on that host
- Trigger a health event
Environment
- NVSentinel version: all
- Kubernetes version: all
- Deployment method: helm
Logs/Output
No response