feat(firewall): prepare cloud engines' firewalls for running ic-gateway as a side-car to the replica#10500
Conversation
97466e1 to
f05b111
Compare
f05b111 to
032d24a
Compare
ic-gatewayic-gateway as a side-car to the replica
There was a problem hiding this comment.
Pull request overview
This PR extends the registry + orchestrator firewall machinery to support a dedicated firewall configuration for cloud engine nodes, preparing them to run ic-gateway alongside the replica (including opening port 80 on assigned cloud engine nodes). It introduces a new registry firewall rules scope (CloudEngines) and wires cloud-engine-specific nftables templates/config into orchestrator role detection and config compilation.
Changes:
- Add
CloudEnginesas a first-classFirewallRulesScopeacross registry keys, candid interface, invariants, and admin tooling. - Teach the orchestrator firewall to detect assigned cloud engine nodes via
SubnetTypeand apply a cloud-engine-specific firewall template/config. - Update nftables golden outputs and
ic.json5template defaults (including additional HTTP adapter blacklist ports).
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| rs/registry/keys/src/lib.rs | Adds CloudEngines scope to FirewallRulesScope parsing/formatting + tests. |
| rs/registry/helpers/src/node.rs | Extends node→subnet lookup helper to also return SubnetType. |
| rs/registry/canister/src/mutations/firewall.rs | Ensures firewall mutation tests cover the new CloudEngines scope. |
| rs/registry/canister/src/invariants/firewall.rs | Adds invariant validation for CloudEngines firewall ruleset. |
| rs/registry/canister/canister/registry.did | Extends candid FirewallRulesScope with CloudEngines. |
| rs/registry/canister/canister/registry_test.did | Mirrors candid scope change for tests. |
| rs/registry/admin/bin/main.rs | Updates CLI help text to document cloud_engines scope. |
| rs/orchestrator/testdata/nftables_unassigned_replica.conf.golden | Updates HTTP adapter blacklist ports in golden output. |
| rs/orchestrator/testdata/nftables_unassigned_cloud_engine.conf.golden | Updates HTTP adapter blacklist ports in golden output. |
| rs/orchestrator/testdata/nftables_assigned_replica.conf.golden | Updates HTTP adapter blacklist ports in golden output. |
| rs/orchestrator/testdata/nftables_assigned_cloud_engine.conf.golden | Updates golden output for new scope + port 80 + blacklist changes. |
| rs/orchestrator/src/upgrade.rs | Refactors subnet-id derivation helper to use RegistryHelper. |
| rs/orchestrator/src/registry_helper.rs | Adds get_subnet_id_and_type_from_node_id and keeps subnet-id-only wrapper. |
| rs/orchestrator/src/orchestrator.rs | Wires cloud engine firewall config into Firewall::new. |
| rs/orchestrator/src/firewall.rs | Adds cloud engine role + config selection + CloudEngines scope fetching and tests. |
| rs/ic_os/config/tool/templates/ic.json5.template | Adds cloud_engine_firewall config block + port 80 accept + blacklist updates. |
| rs/config/src/firewall.rs | Introduces CloudEngineConfig for firewall templating/serialization. |
| rs/config/src/config.rs | Adds cloud_engine_firewall to main config + defaults/loading behavior. |
| rs/config/src/config_sample.rs | Extends sample config with cloud_engine_firewall section. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
This pull request changes code owned by the Governance team. Therefore, make sure that
you have considered the following (for Governance-owned code):
-
Update
unreleased_changelog.md(if there are behavior changes, even if they are
non-breaking). -
Are there BREAKING changes?
-
Is a data migration needed?
-
Security review?
How to Satisfy This Automatic Review
-
Go to the bottom of the pull request page.
-
Look for where it says this bot is requesting changes.
-
Click the three dots to the right.
-
Select "Dismiss review".
-
In the text entry box, respond to each of the numbered items in the previous
section, declare one of the following:
-
Done.
-
$REASON_WHY_NO_NEED. E.g. for
unreleased_changelog.md, "No
canister behavior changes.", or for item 2, "Existing APIs
behave as before.".
Brief Guide to "Externally Visible" Changes
"Externally visible behavior change" is very often due to some NEW canister API.
Changes to EXISTING APIs are more likely to be "breaking".
If these changes are breaking, make sure that clients know how to migrate, how to
maintain their continuity of operations.
If your changes are behind a feature flag, then, do NOT add entrie(s) to
unreleased_changelog.md in this PR! But rather, add entrie(s) later, in the PR
that enables these changes in production.
Reference(s)
For a more comprehensive checklist, see here.
GOVERNANCE_CHECKLIST_REMINDER_DEDUP
|
✅ No security or compliance issues detected. Reviewed everything up to 440bc0c. Security Overview
Detected Code Changes| Change Type | Relevant files ... (code changes summary truncated to fit VCS comment limits.) |
|
✅ No security or compliance issues detected. Reviewed everything up to 440bc0c. Security Overview
Detected Code Changes
|
|
✅ No security or compliance issues detected. Reviewed everything up to 440bc0c. Security Overview
Detected Code Changes
|
- Done.
- No breaking changes. Just adding a new scope.
- No data migration needed.
- No security review needed.
daniel-wong-dfinity-org-twin
left a comment
There was a problem hiding this comment.
For Governance-owned code, this seems more or less harmless (and I assume, adds a new feature). I see some test code, but am not sure about coverage, but not being familiar with firewall (since it is not an "architectural" level aspect of Registry, but rather a "subdomain" issue), I assume it's sufficient -> approve.
| chain OUTPUT {\n\ | ||
| type filter hook output priority 0; policy accept;\n\ | ||
| meta skuid ic-http-adapter fib daddr type local ct state { new } tcp dport { 1-19999 } reject # Block restricted local addresses ic-http-adapter HTTPS access\n\ | ||
| meta skuid ic-http-adapter ip6 daddr { 2a00:fb01:400:42::/64, 2602:fb2b:110::/48, 2602:fb2b:100::/48, 2602:fb2b:120::/48 } ct state { new } tcp dport { 1-19999 } reject # Block restricted outbound ic-http-adapter HTTPS access\n\ |
There was a problem hiding this comment.
Actually, here we should also add the prefix of tp1: 2600:3007:4401::/48
| # DHCPv6\n\ | ||
| udp dport { 546 } accept\n\ | ||
| # TCP ports required for GuestOS functionality\n\ | ||
| ip6 saddr { {{ ipv6_prefix }} } ct state { new } tcp dport { 7070, 9090, 9091, 9100, 9314, 19531, 19100, 19522 } accept\n\ |
There was a problem hiding this comment.
Did you omit 9324 on purpose? This rule allows the node provider to scrape metrics/logs from the same prefix as the one within that node runs.
| chain OUTPUT {\n\ | ||
| type filter hook output priority 0; policy accept;\n\ | ||
| meta skuid ic-http-adapter fib daddr type local ct state { new } tcp dport { 1-19999 } reject # Block restricted local addresses ic-http-adapter HTTPS access\n\ | ||
| meta skuid ic-http-adapter ip6 daddr { 2a00:fb01:400:42::/64, 2602:fb2b:110::/48, 2602:fb2b:100::/48, 2602:fb2b:120::/48 } ct state { new } tcp dport { 1-19999 } reject # Block restricted outbound ic-http-adapter HTTPS access\n\ |
There was a problem hiding this comment.
Here as well, let's add the tp1 prefix: 2600:3007:4401::/48
To make cloud engines self-contained, this PR prepares each of their nodes' firewalls to support running
ic-gatewaynext to the replica, with the former routing external traffic to the latter.Cloud engines now have their own firewall config in
ic.json5. The only differences with a normal replica is that it additionally opens port 80 (foric-gateway, can be later changed to 443 when supporting TLS). Port 9314 (foric-gateway's metrics) can be open through a proposal, which also justifies the newly-addedCloudEnginesfirewall scope.You can visualize the introduced changes by looking at
nftables_assigned_cloud_engine.conf.golden. These apply only to assigned cloud engine nodes, i.e. based on their subnet type and not their node reward type (see hownftables_unassigned_cloud_engine.conf.goldenis unchanged). Indeed,ic-gatewaywill run only when nodes are assigned so it makes sense to open the firewall only then.Naturally, we also include port 9314 to the HTTP adapter blacklist. While we're at it, we also include ports 9324 (
ic-boundary's metrics) and port 19522 (disk encryption's key exchange server during upgrades) to the list, which were likely forgotten by mistake.The motivation for using a separate template than the replica's is that it will diverge even more in the future when adding
ic-boundaryand will include parts of theboundary_node_firewall. The cloud engine's firewall will be mid-way between the replica's and the API BN's so they might as well have their own template.