Commit cbc79a9

docs: add documentation about the different statuses, stages, and state (#74)
1 parent e006835 commit cbc79a9

3 files changed: +85 -0 lines changed

README.md

Lines changed: 2 additions & 0 deletions
@@ -127,6 +127,8 @@ operator to know which way the package is going while also enforcing best versio

**For detailed information about our versioning strategy, git tagging conventions, and component release process, see [docs/versioning.md](docs/versioning.md) and [docs/release-process.md](docs/release-process.md).**

+**For definitions of Status, State, and Stage concepts used throughout the operator, see [docs/operator-status-definitions.md](docs/operator-status-definitions.md).**
+
## Packages
Part of how the operator works is the [skyhook-agent](agent/README.md). Packages have to be created in a way so the operator knows how to use them. This is where the agent comes into play, more on that later. A package is a container that meets these requirements:

docs/README.md

Lines changed: 2 additions & 0 deletions
@@ -22,6 +22,8 @@ This directory contains user and operator documentation for Skyhook. Here you'll

- [Operator Resources At Scale](operator_resources_at_scale.md): Considerations for how cpu and memory have to change for the Operator pods as cluster nodes and skyhook packages change.

+- [Operator Status Definitions](operator-status-definitions.md): Definitions of Status, State, and Stage concepts used throughout the Skyhook operator.
+
- **Process**
- [Releases](releases.md):
Release notes and upgrade information for Skyhook.

docs/operator-status-definitions.md

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
# Operator Status, State, and Stage Definitions

This document defines the Status, State, and Stage concepts that the Skyhook operator uses to track package operations and manage the node lifecycle.

## Key Relationships

- **Status** reflects the overall health and progress of nodes and of the Skyhook resource
- **State** tracks the execution state of an individual package operation
- **Stage** identifies the lifecycle phase a package is currently in

- A node's Status is derived from the collective States of its packages
- Stages progress sequentially, with State indicating success or failure at each stage
- Every stage except `interrupt` has a paired `-check` validation stage that must succeed before the package can progress

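The roll-up from package States to a node Status can be pictured with a small sketch. The precedence used here (`erroring` over `in_progress` over `complete`) and the function name `derive_node_status` are assumptions made for illustration; this document does not spell out the operator's actual derivation rule.

```python
from typing import Iterable


def derive_node_status(package_states: Iterable[str]) -> str:
    """Hypothetical roll-up of package States into a single node Status.

    The precedence order below is an assumption, not the operator's
    documented behaviour.
    """
    states = list(package_states)
    if not states:
        # No packages reported yet, so nothing can be concluded.
        return "unknown"
    if "erroring" in states:
        # Any failing package marks the whole node as erroring.
        return "erroring"
    if "in_progress" in states:
        # Otherwise, any running package keeps the node in progress.
        return "in_progress"
    if all(state in ("complete", "skipped") for state in states):
        # Every package either finished or was intentionally bypassed.
        return "complete"
    # Anything else (for example an `unknown` package state).
    return "unknown"
```

Under these assumptions, a node with one `erroring` package reports `erroring` even if its other packages are `complete`.
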
## Usage in Operations

- **Monitoring**: Use Status for high-level health checks and dashboards
- **Debugging**: Examine State and Stage for detailed package-level troubleshooting
- **Automation**: State transitions trigger the next appropriate Stage in the lifecycle
- **Scheduling**: Status values like `blocked` and `paused` control operation scheduling and dependencies

## Status

**Scope**: Applied to the overall Skyhook resource and to individual nodes

**Purpose**: High-level operational status indicating the current condition

| Status | Definition |
|--------|------------|
| `complete` | All operations have finished successfully |
| `blocked` | Operations are prevented from proceeding due to taint toleration issues |
| `waiting` | Queued for execution but not yet started |
| `disabled` | Execution is disabled for this Skyhook; other Skyhooks continue to execute |
| `paused` | Execution is paused for this Skyhook and for every Skyhook that is meant to run after it |
| `in_progress` | Currently executing operations |
| `erroring` | Experiencing failures or errors |
| `unknown` | Status cannot be determined or is uninitialized |

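The difference between `disabled` and `paused` is easiest to see on an ordered list of Skyhooks. The following is a minimal sketch, assuming the Skyhooks are already available in execution order as `(name, status)` pairs; the function name and data shape are hypothetical and are not taken from the operator's code.

```python
from typing import List, Tuple


def skyhooks_allowed_to_proceed(ordered: List[Tuple[str, str]]) -> List[str]:
    """Return the names of Skyhooks not held back by `disabled` or `paused`.

    `ordered` is assumed to list (name, status) pairs in execution order.
    """
    allowed = []
    for name, status in ordered:
        if status == "paused":
            # A paused Skyhook stops itself and everything after it.
            break
        if status == "disabled":
            # A disabled Skyhook is skipped; later Skyhooks still proceed.
            continue
        allowed.append(name)
    return allowed
```

For example, with the order `a (complete)`, `b (disabled)`, `c (waiting)`, `d (paused)`, `e (waiting)`, this sketch lets `a` and `c` proceed and holds back `d` and `e`.
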
## State

**Scope**: Applied to individual packages within a node

**Purpose**: Current execution state of a specific package operation

| State | Definition |
|-------|------------|
| `complete` | Package operation has finished successfully |
| `in_progress` | Package is actively running (pod has started) |
| `skipped` | Package/stage was intentionally bypassed in the lifecycle |
| `erroring` | Package operation is experiencing failures |
| `unknown` | Package state cannot be determined or is uninitialized |

## Stage

**Scope**: Applied to individual packages

**Purpose**: Indicates which phase of the package installation/management process is currently executing

| Stage | Definition |
|-------|------------|
| `uninstall` & `uninstall-check` | Removal of the package |
| `upgrade` & `upgrade-check` | Package version update operations |
| `apply` & `apply-check` | Initial installation/deployment of the package |
| `config` & `config-check` | Configuration and setup operations |
| `interrupt` | Execution of interrupt operations (e.g., reboots, service restarts) |
| `post-interrupt` & `post-interrupt-check` | Operations that run after interrupt completion |

**NOTE**: Every stage except `interrupt` has a paired `-check` stage whose validation must succeed before the package can progress.

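The pairing in the table above follows a simple naming pattern, which can be sketched as a lookup. The helper name `check_stage_for` is hypothetical; only the stage names themselves come from the table.

```python
from typing import Optional

# Stage names as listed in the Stage table.
STAGES = ("uninstall", "upgrade", "apply", "config", "interrupt", "post-interrupt")


def check_stage_for(stage: str) -> Optional[str]:
    """Return the validation stage paired with `stage`, or None for `interrupt`."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    if stage == "interrupt":
        # `interrupt` is the only stage without a companion check.
        return None
    return f"{stage}-check"
```

A caller can use the `None` result to decide whether a validation step has to be awaited before moving on.
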
## Stage Flow

The typical stage progression depends on whether the package has interrupts:

### Without Interrupts:
```
uninstall → apply → config
upgrade → config
```

### With Interrupts:
```
uninstall → apply → config → interrupt → post-interrupt
upgrade → config → interrupt → post-interrupt
```
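
The four progressions above reduce to two choices: which stage the flow starts from, and whether the package has an interrupt. A minimal sketch, with a hypothetical function name and parameters that are not part of the operator's API:

```python
from typing import List


def stage_flow(starts_with_upgrade: bool, has_interrupt: bool) -> List[str]:
    """Build the stage sequence shown in the Stage Flow diagrams."""
    if starts_with_upgrade:
        stages = ["upgrade", "config"]
    else:
        stages = ["uninstall", "apply", "config"]
    if has_interrupt:
        # Interrupt stages are appended after configuration.
        stages += ["interrupt", "post-interrupt"]
    return stages


if __name__ == "__main__":
    print(" → ".join(stage_flow(starts_with_upgrade=True, has_interrupt=True)))
```

Check stages are left out of this sketch so that its output lines up with the diagrams.
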
