[Bug]: GPU health monitor errors on shutdown

### Prerequisites

- [x] I searched existing issues
- [x] I can reproduce this issue

### Bug Description

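The GPU health monitor crashes with a Python `TypeError` on shutdown. The traceback shows that `process_exit_signal()`, a function defined inside `cli`, takes no parameters but is called with two positional arguments. That matches how Python invokes signal handlers: with the signal number and the current stack frame.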

### Component

Health Monitor

### Steps to Reproduce

1. Start the GPU health monitor
2. Tail its logs
3. Delete the pod and observe the traceback in the logs

### Environment

- NVSentinel version: 0.3.0
- Kubernetes version: 1.33
- Deployment method: helm


### Logs/Output

```
Traceback (most recent call last):
  File "/usr/local/bin/gpu_health_monitor", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1462, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1383, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1246, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 814, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/gpu_health_monitor/cli.py", line 145, in cli
    dcgm_watcher.start([], exit)
  File "/usr/local/lib/python3.10/dist-packages/gpu_health_monitor/dcgm_watcher/dcgm.py", line 308, in start
    exit.wait(self._poll_interval_seconds)
  File "/usr/lib/python3.10/threading.py", line 607, in wait
    signaled = self._cond.wait(timeout)
  File "/usr/lib/python3.10/threading.py", line 324, in wait
    gotit = waiter.acquire(True, timeout)
TypeError: cli.<locals>.process_exit_signal() takes 0 positional arguments but 2 were given
```
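
A possible fix, sketched under assumptions: the source of `cli.py` is not shown here, so this only assumes that `process_exit_signal` is registered via `signal.signal` and that `exit` is a `threading.Event` (the traceback shows `exit.wait(...)` going through `threading.py`). The helper names `install_exit_handlers` and `poll_until_exit` are hypothetical and used for illustration only. The key point is that the handler accepts the two arguments Python passes to it.

```python
import signal
import threading


def install_exit_handlers(exit_event: threading.Event) -> None:
    """Register SIGTERM/SIGINT handlers that set exit_event (hypothetical helper)."""

    def process_exit_signal(signum, frame):
        # Python calls signal handlers as handler(signum, frame), so the
        # handler must accept both arguments even if it ignores them.
        exit_event.set()

    signal.signal(signal.SIGTERM, process_exit_signal)
    signal.signal(signal.SIGINT, process_exit_signal)


def poll_until_exit(exit_event: threading.Event, poll_interval_seconds: float) -> int:
    """Stand-in for the watcher loop: poll until exit_event is set.

    Returns the number of polls that were started.
    """
    polls = 0
    while not exit_event.is_set():
        polls += 1
        # Same call pattern as the traceback: wait on the event with a timeout.
        exit_event.wait(poll_interval_seconds)
    return polls
```

With a handler of this shape, a SIGTERM sent during pod termination should set the event and let the polling loop return cleanly instead of raising inside `exit.wait(...)`.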

[Bug]: GPU health monitor errors on shutdown #365

Prerequisites

Bug Description

Component

Steps to Reproduce

Environment

Logs/Output

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Bug]: GPU health monitor errors on shutdown #365

Description

Prerequisites

Bug Description

Component

Steps to Reproduce

Environment

Logs/Output

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions