# Configurable Cluster Replica Sizes

## The Problem

We want users to be able to safely configure cluster replica sizes at runtime for self-managed deployments.

> At the moment, we pass the cluster replica size map as a command line argument to environmentd, and this is the only way to specify cluster replica sizes and their metadata. A change to the cluster replica size map requires a restart of environmentd, and patching the Kubernetes stateful set, or whatever deployment mechanism the user has in place. This is a tedious, dangerous, and inefficient process.

Currently, cluster replica sizes are created on bootstrap via CLI arguments, saved in the catalog as `cluster_replica_sizes`, and then written to our builtin tables. On creation of a cluster replica, we save its size as a `ReplicaAllocation` object [[code pointer](https://github.com/MaterializeInc/materialize/blob/d37c5be00a29b8b4d5e341f3efac15f9f01a8942/src/adapter/src/catalog/state.rs#L2290)].

## Success Criteria

- Allow a self-managed user to configure cluster replica sizes with a schema similar to this code snippet: [https://materialize.com/docs/self-managed/v25.2/sql/appendix-cluster-sizes/#custom-cluster-sizes](https://materialize.com/docs/self-managed/v25.2/sql/appendix-cluster-sizes/#custom-cluster-sizes)
- Allow a user to validate the custom sizes by interacting with the database itself
- Fix potential breaking changes in all clients (i.e. the Console)
- Update Terraform providers to use this new functionality

## Out of Scope

- Allow configurable sizes in Cloud
- Propagate modifications of a cluster replica size to all existing replicas using that size
- Documentation of other system variables

## Solution Proposal: Manage cluster replica sizes via a dyncfg

We want to separate system replica sizes from custom user cluster replica sizes, where the custom user sizes come from a dyncfg. The workflow for editing a replica size would look something like this:

1. Materialize deployments will come with a k8s configmap. This configmap will contain a JSON object called something similar to `custom_user_cluster_replica_sizes`, with a shape similar to this [[code snippet](https://materialize.com/docs/self-managed/v25.2/sql/appendix-cluster-sizes/#custom-cluster-sizes)] (see the sketch after this list).
1. Edits to the configmap should sync automatically, and no restart of environmentd should be required. You can make edits to the configmap either by editing it in Kubernetes directly or by syncing it locally with commands like:

    ```
    kubectl get configmap my-configmap -o yaml > my-configmap.yaml
    kubectl apply -f my-configmap.yaml
    ```

1. To verify the cluster replica sizes in the database itself, one can run `SHOW custom_user_cluster_replica_sizes`.
1. If the configmap fails to sync, we’ll log a warning in the environmentd pod indicating which field is causing the issue.

- If a cluster size is modified, any existing clusters with that size shouldn’t be affected; only newly created cluster replicas with the modified size will pick up the change.
- We can also add a field to the Helm chart install to pre-populate this configmap with initial custom cluster replica sizes.
- By default, this configmap will only merge into our dyncfgs and won’t be required.
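
Below is a minimal sketch of what such a configmap could look like, applied with the same `kubectl apply` flow shown above. The configmap name, namespace, dyncfg key, and per-size fields (`workers`, `scale`, `cpu_limit`, `memory_limit`, `disk_limit`, `credits_per_hour`) are illustrative assumptions modeled on the shape in the linked custom cluster sizes documentation, not the final schema.

```
# Sketch only: names and fields below are assumptions, not the final schema.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-configmap                  # hypothetical name, reused from the commands above
  namespace: materialize-environment  # hypothetical namespace
data:
  custom_user_cluster_replica_sizes: |
    {
      "my_custom_size": {
        "workers": 1,
        "scale": 1,
        "cpu_limit": 2.0,
        "memory_limit": "4GiB",
        "disk_limit": "10GiB",
        "credits_per_hour": "0"
      }
    }
EOF
```

Setting `credits_per_hour` to `"0"` in the sketch lines up with the note below about avoiding conflicts with the license key check.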

To achieve this, we’d need to do the following:

### Sync dyncfg values from a file

We currently have functionality to sync dyncfg values from a file: https://github.com/MaterializeInc/materialize/pull/32317

### Orchestrate creation of the file

Similar to the listeners configmap for password auth [[code pointer](https://github.com/MaterializeInc/materialize/blob/v0.151.0/src/orchestratord/src/controller/materialize/environmentd.rs#L1152-L1172)], we can either:

- Allow orchestratord to create a configmap and mount it as a volume in environmentd, then wire that path into the dyncfg synchronization (see the sketch after this list).
- Create a custom cluster size CRD. This will allow orchestratord to handle statefulset creation in the future.
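
For the first option, the wiring orchestratord would generate might look roughly like the following fragment of the environmentd pod spec. The volume name, mount path, and configmap name are placeholders; the mounted file path would then be handed to the file-based dyncfg sync described above.

```
# Sketch of pod-spec wiring orchestratord could generate (names and paths are placeholders).
volumes:
  - name: custom-cluster-sizes
    configMap:
      name: my-configmap
containers:
  - name: environmentd
    volumeMounts:
      - name: custom-cluster-sizes
        mountPath: /etc/materialize/dyncfg   # hypothetical path passed to the dyncfg file sync
```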

This provides the following positive properties:

- No risk of removing built-in objects that rely on a specific built-in cluster size
- Not accessible to Cloud by default. We also don’t have to worry about conflicts with our license key check if we automatically set `credits_per_hour` to 0
- Runtime updates via writes to the configmap

The biggest con is that it’s unclear how to sync the dyncfg value to our builtin catalog tables. I’ve written more about this in the Open Questions section.

## Alternatives

### Utilize `extraArgs` in our Materialize CR as the source of truth and force a restart of environmentd on every change

**Pros:**

- Easiest / quickest to implement. It’d simply be adding a dyncfg
- Makes it easier to sync the dyncfg value to our builtin catalog tables
- Some system variables actually require a restart of environmentd, so this creates a unified interface for all system variables.

**Cons:**

- Can’t edit cluster replica sizes at runtime

### Manage as catalog objects

**Pros:**

- Builtin table updates become easier

**Cons:**

- More time-consuming to implement, and we lock ourselves into backwards compatibility with the new syntax
- The central source of truth for cluster sizes lives in the database; we can’t easily just update a file
- Not clear that the user actually wants a DML interface

## Rollout

### Testing

- An environmentd test to check that the dyncfg is reflected in the system catalog, similar to `src/environmentd/tests/bootstrap_builtin_clusters.rs`
- A cloudtest that asserts that live changes to the synced file are reflected in the database

### Lifecycle

- Customers will roll this out using a new version of the operator as well as Materialize. This would involve a `helm upgrade` on the operator and a `kubectl apply` on the Materialize CR (see the sketch below).
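
As a rough sketch of that rollout, assuming a Helm release named `my-materialize-operator` installed from a `materialize/materialize-operator` chart and a locally saved CR manifest (all names and the version are placeholders for the user’s actual setup):

```
# Placeholders: release name, chart reference, version, and CR file depend on the installation.
helm upgrade my-materialize-operator materialize/materialize-operator --version <new-operator-version>
kubectl apply -f my-materialize-cr.yaml
```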

## Open questions

- What should the interface be in the Materialize CR? We have the following options:
  - A boolean that signals whether we want to create the configmap
  - Allow defining cluster replica sizes in the Materialize CR
    - The risk here is that we don’t want the Materialize CR definition file to be the source of truth
- We currently sync the builtin table `mz_cluster_replica_sizes` with CLI arguments during bootstrap by updating the catalog in memory with our new sizes and then writing to our builtin tables. This is an issue, however, since there’s no mechanism to sync the dyncfg to our builtin tables, and it’s questionable whether we really want to write to our builtin tables on changes to a JSON string dyncfg. There are a few options:
  - Watch for changes to the dyncfg and write to builtin tables
  - De-normalize `mz_cluster_replica_sizes` into `mz_clusters` (see the query sketch below)
  - Document that users must restart environmentd after updating the dyncfg for changes to appear in the Console
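
To illustrate the de-normalization option: today, consumers join the size metadata in at query time, roughly like the sketch below (run here via `psql`; the column names are based on the current `mz_clusters` and `mz_cluster_replica_sizes` builtin tables and may differ across versions). De-normalizing would fold some of these columns directly into `mz_clusters`.

```
# Rough illustration of the join that de-normalization would remove; $MATERIALIZE_URL is a placeholder.
psql "$MATERIALIZE_URL" -c "
  SELECT c.name, c.size, s.processes, s.workers, s.credits_per_hour
  FROM mz_clusters c
  JOIN mz_cluster_replica_sizes s ON c.size = s.size
  WHERE c.managed;
"
```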
