-
Notifications
You must be signed in to change notification settings - Fork 11
HDDS-11618. Enable HA modes for OM and SCM #10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 3 commits
d5a2af0
8c2606c
d7e4322
4c27e2d
d60087d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -28,6 +28,10 @@ spec: | |
| ports: | ||
| - name: ui | ||
| port: {{ .Values.om.service.port }} | ||
| {{- if gt (int .Values.om.replicas) 1 }} | ||
| - name: ratis | ||
| port: 9872 | ||
| {{- end }} | ||
|
||
| selector: | ||
| {{- include "ozone.selectorLabels" . | nindent 4 }} | ||
| app.kubernetes.io/component: om | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -31,6 +31,7 @@ metadata: | |
| app.kubernetes.io/component: om | ||
| spec: | ||
| replicas: {{ .Values.om.replicas }} | ||
| podManagementPolicy: Parallel | ||
|
||
| serviceName: {{ .Release.Name }}-om-headless | ||
| selector: | ||
| matchLabels: | ||
|
|
@@ -71,6 +72,10 @@ spec: | |
| containerPort: 9862 | ||
| - name: ui | ||
| containerPort: {{ .Values.om.service.port }} | ||
| {{- if gt (int .Values.om.replicas) 1 }} | ||
| - name: ratis | ||
| containerPort: 9872 | ||
| {{- end }} | ||
| livenessProbe: | ||
| httpGet: | ||
| path: / | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -31,6 +31,7 @@ metadata: | |
| app.kubernetes.io/component: scm | ||
| spec: | ||
| replicas: {{ .Values.scm.replicas }} | ||
| podManagementPolicy: Parallel | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is a
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see a different problem: If we use |
||
| serviceName: {{ .Release.Name }}-scm-headless | ||
| selector: | ||
| matchLabels: | ||
|
|
@@ -61,6 +62,24 @@ spec: | |
| mountPath: {{ .Values.configuration.dir }} | ||
| - name: {{ .Release.Name }}-scm | ||
| mountPath: {{ .Values.scm.persistence.path }} | ||
| {{- if gt (int .Values.scm.replicas) 1 }} | ||
| - name: bootstrap | ||
| image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}" | ||
| args: ["ozone", "scm", "--bootstrap"] | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. From the SCM HA docs:
Here, we call It is not clear from the documentation whether There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It appears from the documentation that
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I used the following doc:
This can be found in the Auto-bootstrap section here. I understood that we can run this commands on every instance and it will automatically detect what to do if
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I used this because of the ratis ring. The problem with the other configuration is that the hosts of unstated nodes from the headless service cannot be resolved. So the first SCM node cannot start because it depends on the resolution of the other ones, which again cannot start because they start only if first node has finished successfully
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
However, this is only relevant for the bootstrap process. So, you might be right. We probably only need the bootstrap once in persistent mode. Maybe someone from the Ozone contributors can provide a definitive answer to this question. |
||
| env: | ||
| {{- include "ozone.configuration.env" . | nindent 12 }} | ||
| {{- with $env }} | ||
| {{- tpl (toYaml .) $ | nindent 12 }} | ||
| {{- end }} | ||
| {{- with $envFrom }} | ||
| envFrom: {{- tpl (toYaml .) $ | nindent 12 }} | ||
| {{- end }} | ||
| volumeMounts: | ||
| - name: config | ||
| mountPath: {{ .Values.configuration.dir }} | ||
| - name: {{ .Release.Name }}-scm | ||
| mountPath: {{ .Values.scm.persistence.path }} | ||
| {{- end }} | ||
| containers: | ||
| - name: scm | ||
| image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}" | ||
|
|
@@ -82,10 +101,18 @@ spec: | |
| ports: | ||
| - name: rpc-client | ||
| containerPort: 9860 | ||
| - name: block-client | ||
| containerPort: 9863 | ||
| - name: rpc-datanode | ||
| containerPort: 9861 | ||
| - name: ui | ||
| containerPort: {{ .Values.scm.service.port }} | ||
| {{- if gt (int .Values.scm.replicas) 1 }} | ||
| - name: ratis | ||
| containerPort: 9894 | ||
| - name: grpc | ||
| containerPort: 9895 | ||
| {{- end }} | ||
|
Comment on lines
+110
to
+115
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since we are exposing ports, we should also have a readiness probe to toggle access to these ports in a separate Jira. |
||
| livenessProbe: | ||
| httpGet: | ||
| path: / | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the port numbers in all the services and statefulsets be referenced from the
values.yamlfile? We would like to avoid hardcoding them.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes definitely! I just proposed this. This was also my personal question ^^