Commit 35489a0

FengPan-FrankUbuntu
authored and committed
Add design for migrating systemd-managed docker containers to Kubernetes with resource control
1 parent 2c1eb60 commit 35489a0

doc/bmp/k8s_migration_design.md

Lines changed: 304 additions & 0 deletions
@@ -0,0 +1,304 @@
# Migrating Image-Managed Docker Containers to Kubernetes with Resource Control

## Background

In the current SONiC architecture, many containers are image-managed: they are packed into the build image and managed by the NDM golden config. They are commonly deployed and managed using `systemd` and monitored using tools like `monit`. Now that KubeSonic is in the picture, this deployment model lacks the advanced orchestration and native resource management features offered by Kubernetes.

This document outlines a generic approach to migrating any image-managed Docker container to Kubernetes, providing CPU and memory resource controls while maintaining backward compatibility with the existing `systemd` workflows. The BMP container (`docker-sonic-bmp`) is used as a concrete example.

## Objective

- Standardize container deployment using Kubernetes, including image-native containers, which are controlled via the FEATURE table in the NDM golden config.
- Enforce CPU and memory resource constraints natively.
- Maintain the `systemd` interface for backward compatibility.
- Optionally integrate existing monitoring systems during the transition.

---

## Standardize Kubernetes-Based Container Deployment

### Image-Native Container Migration

We need to migrate from an image-managed container to a Kubernetes-managed container while preserving compatibility and avoiding dual-running instances.

The potential options are described below.

### One-Time Migration Step
Define a Kubernetes pre-deployment job that detects and stops or removes the native container (e.g., via systemd or Docker) before the Kubernetes deployment is enabled.
Disable any native auto-restart logic (e.g., `systemctl disable`, `docker rm -f`). Note that this may break existing features such as CriticalProcessHealthChecker, featured, and systemHealth.

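The one-time step could be sketched as a small shell helper. This is a minimal sketch under stated assumptions, not the project's actual pre-deployment job: the function names `run` and `stop_native_container` and the `DRY_RUN` switch are hypothetical, and it assumes the native container and its systemd unit share the feature name (e.g., `bmp`).

```bash
#!/bin/bash
# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop and disable the image-managed instance so that only the
# Kubernetes-managed one runs. Assumes unit name == container name.
stop_native_container() {
    local name="$1"
    run systemctl stop "${name}.service"
    run systemctl disable "${name}.service"
    # Force-remove the leftover Docker container; ignore the error if it is already gone.
    run docker rm -f "$name" || true
}
```

Running `DRY_RUN=1 stop_native_container bmp` only prints the three commands, which makes it possible to review the effect before touching a device.
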
### Mirror the Real State into the Config Flag
With this option, we keep updating the FEATURE table to reflect the real state in Kubernetes:

- When the Kubernetes bmp Deployment is enabled → set `FEATURE|bmp` `state = enabled`
- When the Kubernetes Deployment is rolled back → set `FEATURE|bmp` `state = disabled`

We can implement this as a Kubernetes CronJob, as shown below:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: sync-bmp-status
  namespace: sonic
spec:
  schedule: "*/1 * * * *"  # every 1 minute
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: redis-updater
              # NOTE: the image must provide both kubectl and redis-cli;
              # the stock redis image ships only redis-cli.
              image: redis:latest
              command: ["/bin/sh", "-c"]
              args:
                - |
                  replicas=$(kubectl get deploy bmp -n sonic -o jsonpath='{.spec.replicas}')
                  # Quote the key: an unquoted "|" would be parsed as a shell pipe.
                  if [ "${replicas:-0}" -gt 0 ]; then
                    redis-cli -h "$REDIS_HOST" HSET 'FEATURE|bmp' state enabled
                  else
                    redis-cli -h "$REDIS_HOST" HSET 'FEATURE|bmp' state disabled
                  fi
              env:
                - name: REDIS_HOST
                  value: my.redis.host
          restartPolicy: OnFailure
```

### Enforce CPU and Memory Resource Constraints Natively

Kubernetes provides native resource management through the `resources` spec, which lets you define `requests` (the amount the scheduler reserves for the container) and `limits` (the maximum the container may use) for CPU and memory.

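To make the units concrete: a CPU quantity of `100m` means 100 millicores (0.1 of a core), and a memory quantity of `800Mi` means 800 × 2^20 bytes. The helper below is an illustrative sketch only; `mem_to_bytes` and `cpu_to_millicores` are hypothetical names, and the sketch handles just the `Ki`/`Mi`/`Gi` suffixes and whole-number CPU values, not the full Kubernetes quantity syntax.

```bash
#!/bin/bash
# Convert a Kubernetes memory quantity with a binary suffix to bytes.
# Only Ki, Mi and Gi (and plain byte counts) are handled in this sketch.
mem_to_bytes() {
    local q="$1"
    case "$q" in
        *Ki) echo $(( ${q%Ki} * 1024 )) ;;
        *Mi) echo $(( ${q%Mi} * 1024 * 1024 )) ;;
        *Gi) echo $(( ${q%Gi} * 1024 * 1024 * 1024 )) ;;
        *)   echo "$q" ;;
    esac
}

# Convert a CPU quantity to millicores: "500m" -> 500, "2" -> 2000.
# Fractional values such as "0.5" are not handled here.
cpu_to_millicores() {
    local q="$1"
    case "$q" in
        *m) echo "${q%m}" ;;
        *)  echo $(( q * 1000 )) ;;
    esac
}
```

For example, `mem_to_bytes 800Mi` prints `838860800`, and `cpu_to_millicores 500m` prints `500`.
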
### Example Deployment YAML (Generic)

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: <container-name>
  namespace: <namespace>
spec:
  replicas: 1
  selector:
    matchLabels:
      app: <container-name>
  template:
    metadata:
      labels:
        app: <container-name>
    spec:
      containers:
        - name: <container-name>
          image: <container-image>
          command: ["<startup-command>"]
          resources:
            requests:
              memory: "100Mi"
              cpu: "100m"
            limits:
              memory: "800Mi"
              cpu: "500m"
          ports:
            - containerPort: <port>
          livenessProbe:
            exec:
              command: ["/usr/bin/pgrep", "<main-process>"]
            initialDelaySeconds: 60
            periodSeconds: 30
          readinessProbe:
            exec:
              command: ["/usr/bin/pgrep", "<main-process>"]
            initialDelaySeconds: 30
            periodSeconds: 15
```

### Example: BMP Container

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bmp
  namespace: sonic
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bmp
  template:
    metadata:
      labels:
        app: bmp
    spec:
      containers:
        - name: bmp
          image: ksdatatest.azurecr.io/docker-sonic-bmp:latest
          command: ["/usr/local/bin/supervisord"]
          resources:
            requests:
              memory: "100Mi"
              cpu: "100m"
            limits:
              memory: "800Mi"
              cpu: "500m"
          ports:
            - containerPort: 5000
          livenessProbe:
            exec:
              command: ["/usr/bin/pgrep", "openbmpd"]
            initialDelaySeconds: 60
            periodSeconds: 30
          readinessProbe:
            exec:
              command: ["/usr/bin/pgrep", "openbmpd"]
            initialDelaySeconds: 30
            periodSeconds: 15
```

---

## Maintaining `systemd` Compatibility

In environments where existing operational workflows depend on managing containers via `systemd`, we can preserve compatibility by implementing a proxy `systemd` unit that interacts with Kubernetes behind the scenes. This allows existing automation tools and scripts that call `systemctl` to continue functioning without modification, even though the container is now orchestrated by Kubernetes.

### Rationale

Many production systems have monitoring, automation, or recovery mechanisms that depend on:
- `systemctl start <service>`
- `systemctl stop <service>`
- `systemctl status <service>`

To prevent breaking these expectations during the migration, a `systemd` service stub can be provided.

### Step-by-Step Setup

#### 1. Create Wrapper Script

Create a script `/usr/local/bin/k8s-wrapper.sh` that translates `systemd`-style commands into `kubectl` actions:

```bash
#!/bin/bash
# Translate systemd-style actions into kubectl operations on a Deployment.
set -e

NAME="$1"
ACTION="$2"
# Must match the namespace of the Deployment (the examples above use "sonic").
NAMESPACE="sonic"

if [[ -z "$NAME" || -z "$ACTION" ]]; then
    echo "Usage: $0 <container-name> {start|stop|restart|status}"
    exit 1
fi

case "$ACTION" in
    start)
        echo "[INFO] Scaling $NAME deployment to 1"
        kubectl scale deployment "$NAME" --replicas=1 -n "$NAMESPACE"
        ;;
    stop)
        echo "[INFO] Scaling $NAME deployment to 0"
        kubectl scale deployment "$NAME" --replicas=0 -n "$NAMESPACE"
        ;;
    restart)
        echo "[INFO] Restarting $NAME deployment"
        kubectl rollout restart deployment "$NAME" -n "$NAMESPACE"
        ;;
    status)
        echo "[INFO] Getting pod status for $NAME"
        kubectl get pods -l app="$NAME" -n "$NAMESPACE" -o wide
        ;;
    *)
        echo "Usage: $0 <container-name> {start|stop|restart|status}"
        exit 1
        ;;
esac
```

Make the script executable:

```bash
chmod +x /usr/local/bin/k8s-wrapper.sh
```

#### 2. Create systemd Unit File

Example unit file: `/etc/systemd/system/bmp.service`

```ini
[Unit]
Description=Kubernetes managed container bmp
After=network.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/k8s-wrapper.sh bmp start
ExecStop=/usr/local/bin/k8s-wrapper.sh bmp stop
ExecReload=/usr/local/bin/k8s-wrapper.sh bmp restart
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
```

#### 3. Reload systemd and Enable the Stub Service

Reload the unit files and enable the stub:

```bash
sudo systemctl daemon-reload
sudo systemctl enable bmp.service
```

Then check that the stub responds to the usual commands:

```bash
sudo systemctl start bmp.service
sudo systemctl status bmp.service
sudo systemctl stop bmp.service
```

### Limitations

- `systemctl status` reports only the state of the stub unit (for a `oneshot` unit with `RemainAfterExit=yes`, typically `active (exited)`); it does not show the container's PID or exit code. Pod status is available through the wrapper's `status` action.
- Restart policies defined in systemd (e.g., `Restart=on-failure`) will not work; Kubernetes handles restarts via `livenessProbe` and `restartPolicy`.
- This assumes `kubectl` is installed and configured to access the correct Kubernetes cluster and namespace.

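Tools that rely on an exit code rather than on the text of `systemctl status` could use a readiness check along the following lines. This is a hedged sketch: `deployment_is_ready` is a hypothetical function name, the default namespace `sonic` is taken from the examples in this document, and the return value 3 mirrors the LSB "program is not running" status code.

```bash
#!/bin/bash
# Return 0 when at least one replica of the deployment is ready, 3 otherwise.
deployment_is_ready() {
    local name="$1" namespace="${2:-sonic}"
    local ready
    ready=$(kubectl get deployment "$name" -n "$namespace" \
        -o jsonpath='{.status.readyReplicas}' 2>/dev/null)
    # readyReplicas is empty when no replica is ready or when kubectl fails.
    if [ "${ready:-0}" -gt 0 ]; then
        return 0
    fi
    return 3
}
```

A caller can then write `deployment_is_ready bmp || echo "bmp is not running"` in place of parsing `systemctl status` output.
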
### Benefits

- No disruption to automation or legacy tooling that uses `systemctl`.
- Operators familiar with `systemd` can continue using the same commands.
- Allows gradual migration to a fully Kubernetes-native setup.

---

280+
## Monitoring and Alerting
281+
282+
Kubernetes supports integrated monitoring using:
283+
- `kubectl top pod` for resource snapshots
284+
- Prometheus and AlertManager for alerting
285+
- Fluentd, Loki, or EFK stack for logging
286+
287+
If legacy tools like `monit` must be retained temporarily, rewrite checks to use Kubernetes data (e.g., via `kubectl top`) instead of Docker or CGroup files.
288+
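A rewritten check might look like the sketch below, which reads memory usage from `kubectl top` instead of cgroup files. It is an illustration only: `check_pod_memory` is a hypothetical name, and the sketch assumes that the metrics API is available in the cluster, that pods carry the `app=<name>` label used in the Deployment examples, and that `kubectl top` reports memory with an `Mi` suffix.

```bash
#!/bin/bash
# Exit 0 if every pod of the app is at or below the memory threshold (in Mi),
# 1 if any pod exceeds it. With no matching pods the check passes.
check_pod_memory() {
    local app="$1" limit_mi="$2" namespace="${3:-sonic}"
    kubectl top pod -l "app=$app" -n "$namespace" --no-headers |
        awk -v limit="$limit_mi" '
            # Column 3 is the memory usage, e.g. "85Mi"; strip the suffix and compare.
            { sub(/Mi$/, "", $3); if ($3 + 0 > limit + 0) over = 1 }
            END { exit over }
        '
}
```

Monit's `check program` test can run a script built around such a function and alert on a non-zero exit status, for example with `check_pod_memory bmp 800` to match the 800Mi limit used above.
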
---

## Migration Strategy

1. **Deploy container in Kubernetes** in a test environment.
2. **Verify application health, logs, and performance.**
3. **Create `systemd` wrapper service** to mimic old interface.
4. **Transition monitoring (if applicable).**
5. **Gradually phase out Monit or Docker-native tools.**
6. **Monitor and document stability in production.**

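The health verification in step 2 could start with a rollout check such as the one sketched here. `verify_rollout` is a hypothetical helper, and the default namespace (`sonic`) and timeout (`120s`) are assumptions chosen for illustration; log and performance checks would still need to be done separately.

```bash
#!/bin/bash
# Wait for the deployment rollout to finish, then report success or failure.
verify_rollout() {
    local name="$1" namespace="${2:-sonic}" timeout="${3:-120s}"
    if kubectl rollout status "deployment/$name" -n "$namespace" --timeout="$timeout"; then
        echo "[INFO] $name rollout complete"
    else
        echo "[ERROR] $name did not become ready within $timeout" >&2
        return 1
    fi
}
```

For the BMP example this would be invoked as `verify_rollout bmp`.
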
---

## Conclusion

This document provides a generic, reusable pattern for migrating Docker containers from `systemd` + `monit` to Kubernetes. It provides modern resource control while preserving backward compatibility during the transition. The BMP container serves as a concrete example of this process.
