Description
We need a resource that abstractly represents a workload executing on a compute resource (namely a BareMetalHost):
- It MUST describe the workload completely:
  - specification of an OS image,
  - specification of an initial configuration. It SHOULD support all the supported initial configuration formats (cloud-init, ignition),
  - online status.
- It MUST abstract away the identity of the BareMetalHost: a user should be able to describe a workload to execute and a set of requirements on the resource executing it. This is the host selector mechanism exposed at the level of Metal3Machine, but made independent of it.
- It MUST be abstract enough that, with appropriate controllers, we can target compute resources other than BareMetalHost that provide a similar API: typically virtual machines on private or public clouds that can execute an arbitrary OS configured with either cloud-init or ignition on first boot.
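The requirements above can be illustrated with a minimal Go sketch. Every name in it (`WorkloadSpec`, `HostSelector`, `Candidate`, `SelectHost`) is hypothetical and not taken from any existing Metal3 API, and the selector is reduced to exact label matching, which is only an assumption about how host selection could work.

```go
package main

import "fmt"

// HostSelector is a hypothetical, simplified selector: every key/value pair
// in MatchLabels must be present in the labels of a candidate resource.
type HostSelector struct {
	MatchLabels map[string]string
}

// WorkloadSpec is a hypothetical shape for the proposed resource. It describes
// the workload without naming the compute resource that will run it.
type WorkloadSpec struct {
	ImageURL       string       // OS image to deploy
	UserDataFormat string       // "cloud-init" or "ignition"
	UserDataSecret string       // name of a secret holding the initial configuration
	Online         bool         // desired power state
	Selector       HostSelector // requirements on the compute resource
}

// Candidate is a hypothetical view of a compute resource (for example a
// BareMetalHost) reduced to its name and labels.
type Candidate struct {
	Name   string
	Labels map[string]string
}

// Matches reports whether the given labels satisfy the selector.
func (s HostSelector) Matches(labels map[string]string) bool {
	for key, want := range s.MatchLabels {
		if got, ok := labels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

// SelectHost returns the name of the first candidate, in slice order,
// whose labels satisfy the selector of the workload.
func SelectHost(spec WorkloadSpec, candidates []Candidate) (string, bool) {
	for _, c := range candidates {
		if spec.Selector.Matches(c.Labels) {
			return c.Name, true
		}
	}
	return "", false
}

func main() {
	spec := WorkloadSpec{
		ImageURL:       "https://example.invalid/os.qcow2", // placeholder URL
		UserDataFormat: "cloud-init",
		UserDataSecret: "worker-user-data",
		Online:         true,
		Selector:       HostSelector{MatchLabels: map[string]string{"rack": "r1"}},
	}
	candidates := []Candidate{
		{Name: "host-a", Labels: map[string]string{"rack": "r2"}},
		{Name: "host-b", Labels: map[string]string{"rack": "r1", "gpu": "true"}},
	}
	name, ok := SelectHost(spec, candidates)
	fmt.Println(name, ok)
}
```

The user only writes the `WorkloadSpec`; which candidate ends up bound to it is decided by the controller, which is what keeps the identity of the BareMetalHost out of the user's hands.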
The resource MUST be usable in place of a BareMetalHost as the associated target of a Metal3Machine in the Cluster API Provider Metal3 (CAPM3). It SHOULD behave like a BareMetalHost and MUST be transparent for at least the following features:
- data templates,
- in-place updates,
- metal3 remediation.
It MUST support pivoting but may change its semantics. There MUST be a way to point to a compute resource in another cluster if the resource has the right credentials to do so.
Supported use cases
Description of workloads to execute directly on a bare metal server
We want to execute some simple services directly on bare metal. We may not want to specify exactly which machine to use, as long as it fulfills a set of requirements, so that hardware resources are better utilized.
Multi-tenancy for CAPM3
We want to share a set of BareMetalHosts between several clusters belonging to different users. Each user should have a namespace for their cluster. The user must be able to use BareMetalHosts without taking full control of the hardware. They must never get access to the BMC credentials, but they must have a sufficient view of the server they use to configure their cluster (mainly to fill the data templates). When a node is stopped in a cluster, the underlying bare-metal server must be usable by another host.
Hybrid clusters
We want to create clusters with Cluster API whose nodes are hosted on different kinds of compute resources (servers, VMs in public or private clouds). Today the known solutions are either:
- centered around light control planes and one kind of workers (Kamaji),
- complex and incomplete for day 1 operations (Bring Your Own Host),
- hacks (use of several cluster objects with only one implementing the control plane, as presented in https://metal3.io/blog/2022/07/08/One_cluster_multiple_providers.html).
The Metal3 Cluster API provider is a complete solution, relatively abstract from the underlying compute resource, with a lot of tooling (data templates, strong IPAM integration, a notion of remediation, support for in-place updates).
With multiple controllers linking the workload resource to different kinds of compute resources, in the same way as a persistent volume claim can target different storage classes, it could serve as the basis for hybrid clusters.
Pivot Semantics
Regarding BareMetalHosts, if we want to support multi-tenancy we cannot pivot them as this would give full control of the hardware (BMC credentials) to the customer. It would also make it impossible to reuse the servers on the initial cluster when the pivoted cluster is scaled down.
So we want to pivot only the workload resource. This means that it will point to servers on another cluster.
From the user's point of view, this may mean a decrease in dependability because the BareMetalHost controller and Ironic are still hosted on the initial cluster. If the link between the initial cluster and the pivoted cluster is severed, the pivoted cluster will not be able to update the state of its underlying servers.