Commit f2aa810

chore: drop unnecessary prefix and suffix
1 parent c8185ce commit f2aa810

201 files changed: +15763 -1201 lines changed

.github/copilot-instructions.md

Lines changed: 8 additions & 8 deletions
@@ -23,15 +23,15 @@ NVSentinel is a GPU Node Resilience System for Kubernetes that automatically det
 │ ├── csp-health-monitor/ # Cloud provider health checks (Go)
 │ └── syslog-health-monitor/ # System log analysis (Go)
 ├── health-events-analyzer/ # Event classification and routing
-├── fault-quarantine-module/ # Node isolation (cordon)
-├── node-drainer-module/ # Workload eviction
-├── fault-remediation-module/ # Break-fix automation
-├── labeler-module/ # Node labeling (DCGM version, driver status, Kata detection)
+├── fault-quarantine/ # Node isolation (cordon)
+├── node-drainer/ # Workload eviction
+├── fault-remediation/ # Break-fix automation
+├── labeler/ # Node labeling (DCGM version, driver status, Kata detection)
 ├── janitor/ # State cleanup and maintenance
 ├── platform-connectors/ # CSP integration (GCP, AWS, Azure)
 ├── commons/ # Shared utilities
 ├── data-models/ # Protocol Buffer definitions
-├── store-client-sdk/ # MongoDB client library
+├── store-client/ # MongoDB client library
 └── distros/kubernetes/ # Helm charts
 ```
 
@@ -86,7 +86,7 @@ NVSentinel is a GPU Node Resilience System for Kubernetes that automatically det
 make lint-test-all
 
 # Lint specific module
-cd labeler-module && make lint
+cd labeler && make lint
 
 # Test specific module
 cd health-events-analyzer && make test
@@ -112,7 +112,7 @@ tilt up # Start local development environment
 
 ## Kata Containers Detection
 
-The labeler-module implements Kata Containers detection:
+The labeler implements Kata Containers detection:
 
 ### Detection Architecture
 - **Input labels** (on nodes): `katacontainers.io/kata-runtime` (default) + optional custom label
@@ -233,7 +233,7 @@ poetry update
 ```
 
 ### Adding New Node Labels
-1. Update labeler-module logic in `pkg/labeler/labeler.go`
+1. Update labeler logic in `pkg/labeler/labeler.go`
 2. Add tests in `labeler_test.go`
 3. Document in Helm chart values
 4. Update KATA_TESTING.md if kata-related

.github/dependabot.yml

Lines changed: 6 additions & 6 deletions
@@ -53,7 +53,7 @@ updates:
 labels: ["dependencies"]
 
 - package-ecosystem: "gomod"
-directory: "/store-client-sdk"
+directory: "/store-client"
 target-branch: "main"
 schedule:
 interval: "weekly"
@@ -68,28 +68,28 @@ updates:
 
 # Fault Management
 - package-ecosystem: "gomod"
-directory: "/fault-quarantine-module"
+directory: "/fault-quarantine"
 target-branch: "main"
 schedule:
 interval: "weekly"
 labels: ["dependencies"]
 
 - package-ecosystem: "gomod"
-directory: "/fault-remediation-module"
+directory: "/fault-remediation"
 target-branch: "main"
 schedule:
 interval: "weekly"
 labels: ["dependencies"]
 
 - package-ecosystem: "gomod"
-directory: "/labeler-module"
+directory: "/labeler"
 target-branch: "main"
 schedule:
 interval: "weekly"
 labels: ["dependencies"]
 
 - package-ecosystem: "gomod"
-directory: "/node-drainer-module"
+directory: "/node-drainer"
 target-branch: "main"
 schedule:
 interval: "weekly"
@@ -112,7 +112,7 @@ updates:
 labels: ["dependencies"]
 
 - package-ecosystem: "pip"
-directory: "/nvsentinel-log-collector"
+directory: "/log-collector"
 target-branch: "main"
 schedule:
 interval: "weekly"

.github/workflows/container-build-test.yml

Lines changed: 6 additions & 6 deletions
@@ -61,9 +61,9 @@ jobs:
 make_command: 'make -C health-monitors/syslog-health-monitor docker-build'
 # Log Collection (Docker-based)
 - component: log-collector
-make_command: 'make -C nvsentinel-log-collector docker-build-log-collector'
+make_command: 'make -C log-collector docker-build-log-collector'
 - component: file-server-cleanup
-make_command: 'make -C nvsentinel-log-collector docker-build-file-server-cleanup'
+make_command: 'make -C log-collector docker-build-file-server-cleanup'
 steps:
 - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
 
@@ -106,13 +106,13 @@ jobs:
 path: .
 - module: health-events-analyzer
 path: .
-- module: fault-quarantine-module
+- module: fault-quarantine
 path: .
-- module: labeler-module
+- module: labeler
 path: .
-- module: node-drainer-module
+- module: node-drainer
 path: .
-- module: fault-remediation-module
+- module: fault-remediation
 path: .
 - module: janitor
 path: .

.github/workflows/lint-test.yml

Lines changed: 7 additions & 7 deletions
@@ -62,11 +62,11 @@ jobs:
 make_command: 'make gomod-lint'
 step_name: 'Run gomod lint'
 - component: log-collector
-make_command: 'make -C nvsentinel-log-collector lint-log-collector'
+make_command: 'make -C log-collector lint-log-collector'
 step_name: 'Run lint'
 replace_imports: 'false'
 - component: file-server-cleanup
-make_command: 'make -C nvsentinel-log-collector lint-file-server-cleanup'
+make_command: 'make -C log-collector lint-file-server-cleanup'
 step_name: 'Run lint'
 replace_imports: 'false'
 - component: kubernetes-distro
@@ -132,13 +132,13 @@ jobs:
 matrix:
 component:
 - platform-connectors
-- store-client-sdk
+- store-client
 - commons
 - health-events-analyzer
-- fault-quarantine-module
-- labeler-module
-- node-drainer-module
-- fault-remediation-module
+- fault-quarantine
+- labeler
+- node-drainer
+- fault-remediation
 - janitor
 steps:
 - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

.github/workflows/publish.yml

Lines changed: 2 additions & 2 deletions
@@ -115,10 +115,10 @@ jobs:
 make_command: 'make -C health-monitors/syslog-health-monitor docker-publish'
 container_name: 'nvsentinel/syslog-health-monitor'
 - component: log-collector
-make_command: 'make -C nvsentinel-log-collector docker-publish-log-collector'
+make_command: 'make -C log-collector docker-publish-log-collector'
 container_name: 'nvsentinel/log-collector'
 - component: file-server-cleanup
-make_command: 'make -C nvsentinel-log-collector docker-publish-file-server-cleanup'
+make_command: 'make -C log-collector docker-publish-file-server-cleanup'
 container_name: 'nvsentinel/file-server-cleanup'
 steps:
 - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

.gitignore

Lines changed: 4 additions & 4 deletions
@@ -427,10 +427,10 @@ digests.txt
 images.json
 
 # Binary files
-fault-quarantine-module/fault-quarantine-module
-fault-remediation-module/fault-remediation-module
+fault-quarantine/fault-quarantine
+fault-remediation/fault-remediation
 health-events-analyzer/health-events-analyzer
 health-monitors/syslog-health-monitor/syslog-health-monitor
-labeler-module/labeler-module
-node-drainer-module/node-drainer-module
+labeler/labeler
+node-drainer/node-drainer
 platform-connectors/platform-connectors

.ko.yaml

Lines changed: 8 additions & 8 deletions
@@ -19,8 +19,8 @@ env: [CGO_ENABLED=0]
 
 builds:
 
-- id: fault-quarantine-module
-dir: fault-quarantine-module
+- id: fault-quarantine
+dir: fault-quarantine
 main: .
 ldflags:
 - "-s -w"
@@ -36,8 +36,8 @@ builds:
 org.opencontainers.image.revision: "{{.Env.GIT_COMMIT}}"
 org.opencontainers.image.created: "{{.Env.BUILD_DATE}}"
 
-- id: fault-remediation-module
-dir: fault-remediation-module
+- id: fault-remediation
+dir: fault-remediation
 main: .
 ldflags:
 - "-s -w"
@@ -104,8 +104,8 @@ builds:
 org.opencontainers.image.revision: "{{.Env.GIT_COMMIT}}"
 org.opencontainers.image.created: "{{.Env.BUILD_DATE}}"
 
-- id: labeler-module
-dir: labeler-module
+- id: labeler
+dir: labeler
 main: .
 ldflags:
 - "-s -w"
@@ -121,8 +121,8 @@ builds:
 org.opencontainers.image.revision: "{{.Env.GIT_COMMIT}}"
 org.opencontainers.image.created: "{{.Env.BUILD_DATE}}"
 
-- id: node-drainer-module
-dir: node-drainer-module
+- id: node-drainer
+dir: node-drainer
 main: .
 ldflags:
 - "-s -w"

DEVELOPMENT.md

Lines changed: 15 additions & 15 deletions
@@ -101,14 +101,14 @@ nvsentinel/
 │ ├── syslog-health-monitor/ # Go - System log monitoring
 │ └── csp-health-monitor/ # Go - Cloud provider monitoring
 ├── platform-connectors/ # gRPC event ingestion service
-├── fault-quarantine-module/ # CEL-based event quarantine logic
-├── fault-remediation-module/ # Kubernetes controller for remediation
+├── fault-quarantine/ # CEL-based event quarantine logic
+├── fault-remediation/ # Kubernetes controller for remediation
 ├── health-events-analyzer/ # Event analysis and correlation
 ├── health-event-client/ # Event streaming client
-├── labeler-module/ # Node labeling controller
-├── node-drainer-module/ # Graceful workload eviction
-├── store-client-sdk/ # MongoDB interaction library (tested in CI)
-└── nvsentinel-log-collector/ # Log aggregation (shell scripts)
+├── labeler/ # Node labeling controller
+├── node-drainer/ # Graceful workload eviction
+├── store-client/ # MongoDB interaction library (tested in CI)
+└── log-collector/ # Log aggregation (shell scripts)
 ```
 
 ### Communication Flow
@@ -356,8 +356,8 @@ make -C health-monitors/gpu-health-monitor docker-build-dcgm3 # DCGM 3.x local
 make -C health-monitors/gpu-health-monitor docker-publish-dcgm4 # DCGM 4.x CI
 
 # Container-only module (shell + Python)
-make -C nvsentinel-log-collector docker-build-log-collector # Local build
-make -C nvsentinel-log-collector docker-publish-log-collector # CI build
+make -C log-collector docker-build-log-collector # Local build
+make -C log-collector docker-publish-log-collector # CI build
 ```
 
 #### Module-Level Docker Builds
@@ -379,8 +379,8 @@ make -C health-monitors/gpu-health-monitor docker-build-dcgm4 # DCGM 4.x local
 make -C health-monitors/gpu-health-monitor docker-publish # Push both versions
 
 # Container-only module (shell + Python)
-make -C nvsentinel-log-collector docker-build # Both log-collector and file-server-cleanup
-make -C nvsentinel-log-collector docker-publish # Push both components
+make -C log-collector docker-build # Both log-collector and file-server-cleanup
+make -C log-collector docker-publish # Push both components
 
 # Legacy compatibility (all modules)
 make -C [module] image # Calls docker-build
@@ -605,8 +605,8 @@ global:
 
 2. **Implement MongoDB Change Streams**
 ```go
-// Use store-client-sdk for MongoDB operations
-import "github.com/nvidia/nvsentinel/store-client-sdk/pkg/client"
+// Use store-client for MongoDB operations
+import "github.com/nvidia/nvsentinel/store-client/pkg/client"
 ```
 
 3. **Add Proper RBAC**
@@ -635,7 +635,7 @@ make go-lint-test-all # All Go modules (common.mk patterns
 # Test individual modules via delegation (main Makefile)
 make health-events-analyzer-lint-test # Go module
 make platform-connectors-lint-test # Go module
-make store-client-sdk-lint-test # Go module
+make store-client-lint-test # Go module
 make log-collector-lint-test # Container module
 
 # Test individual modules directly (common.mk patterns)
@@ -749,7 +749,7 @@ make lint-test-all # Matches lint-test.yml workflow
 make -C health-monitors/syslog-health-monitor lint-test
 make -C health-monitors/gpu-health-monitor lint-test
 make -C platform-connectors lint-test # Uses common.mk patterns
-make -C nvsentinel-log-collector lint-test # Shell + Python linting
+make -C log-collector lint-test # Shell + Python linting
 
 # Container builds (matches container-build-test.yml)
 make -C health-monitors/syslog-health-monitor docker-build
@@ -837,7 +837,7 @@ The CI environment uses:
 # Local shellcheck version may differ, causing different linting results
 
 # Use standardized linting (matches GitHub Actions):
-make -C nvsentinel-log-collector lint-test # Standardized pattern
+make -C log-collector lint-test # Standardized pattern
 make log-collector-lint # Main Makefile delegation
 
 # Install shellcheck locally to match CI:
