[Feature]: Support for rack "management" #367

### Prerequisites

- [x] I searched existing issues

### Feature Summary

In GB200 or other rack-scale deployments, we need to be able to drain an entire rack when a sufficient number of its nodes go bad. NVSentinel currently does a good job of identifying bad nodes in a rack, but it does not "escalate" to a rack-level issue when bad nodes in that rack persist.

### Problem/Use Case

As an operator, I want the ability to remediate an entire rack when 50% or more of the nodes in that rack are bad due to switch or other errors.
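
To make the threshold in this use case concrete, here is a minimal sketch of the escalation check. It is only an illustration, not NVSentinel code: the function name `racks_to_drain`, the node-to-rack mapping, and the `threshold` parameter are all hypothetical, and it ignores how long a node has been bad (the "persistent" part would need extra state).

```python
from collections import defaultdict


def racks_to_drain(node_rack, bad_nodes, threshold=0.5):
    """Return the racks whose fraction of bad nodes is at least `threshold`.

    node_rack: dict mapping node name -> rack name (hypothetical input shape).
    bad_nodes: set of node names currently flagged as bad.
    """
    total = defaultdict(int)  # nodes per rack
    bad = defaultdict(int)    # bad nodes per rack
    for node, rack in node_rack.items():
        total[rack] += 1
        if node in bad_nodes:
            bad[rack] += 1
    # A rack is escalated when bad / total >= threshold.
    return sorted(rack for rack in total if bad[rack] / total[rack] >= threshold)


# Example: rack-a has 1 of 4 nodes bad, rack-b has 2 of 4 nodes bad.
nodes = {
    "a1": "rack-a", "a2": "rack-a", "a3": "rack-a", "a4": "rack-a",
    "b1": "rack-b", "b2": "rack-b", "b3": "rack-b", "b4": "rack-b",
}
print(racks_to_drain(nodes, {"a1", "b1", "b2"}))  # only rack-b reaches 50%
```

The threshold is a parameter because "sufficient number of nodes" may differ per rack type; 50% is just the value from this use case.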

### Proposed Solution

TBD

### Component

Fault Management
