7.1.0 r2 documentation #890
base: master
… placeholder/starting point.
docs/config-smart-download.md
Outdated
| ## Download Failover Resiliency | ||
| SSR images can be downloaded from a variety of sources, depending on software access mode (eg. internet-only, prefer-conductor, conductor-only, offline-mode): the HA peer, both conductor nodes, artifactory, and the mist proxy to artifactory (cloud deployments only). |
Should Mist be capitalized?
docs/config-smart-download.md
Outdated
| SSR images can be downloaded from a variety of sources, depending on software access mode (eg. internet-only, prefer-conductor, conductor-only, offline-mode): the HA peer, both conductor nodes, artifactory, and the mist proxy to artifactory (cloud deployments only). | ||
| To improve resiliency to network connectivity issues, the SSR queries available versions from all sources before beginning the download. It compiles a list of sources where the requested version is available and begins the download. If more than 50% of requests to a source fail within a window of 10 requests, the SSR marks that source unavailable and moves on to the next source. The following priority order is used for sources: |
In my mind the size of the window is more of an implementation detail and may be subject to change based on tuning. We may want to be less specific about that in case we decide to adjust it in the future. But this may be fine too. Not sure how likely we are to need to adjust it
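For reference while reviewing, the failure threshold described in the paragraph under review can be sketched as a sliding window over recent request outcomes. This is a minimal illustration, not the SSR implementation: the class name `SourceHealth`, its parameters, and the choice to wait for a full window before judging a source are all assumptions. Keeping the window size and ratio as parameters also reflects the point that they may be tuned later.

```python
from collections import deque


class SourceHealth:
    """Track recent request outcomes for one download source (hypothetical helper)."""

    def __init__(self, window=10, max_failure_ratio=0.5):
        # window=10 and max_failure_ratio=0.5 mirror the figures quoted in
        # the doc text; both are treated here as tunable parameters.
        self.window = window
        self.max_failure_ratio = max_failure_ratio
        # True means the request succeeded, False means it failed.
        self.results = deque(maxlen=window)

    def record(self, success):
        """Store the outcome of one request; the oldest outcome drops off."""
        self.results.append(success)

    def unavailable(self):
        """Return True once more than the allowed share of the window failed."""
        # Assumption: a source is only judged after a full window of requests.
        if len(self.results) < self.window:
            return False
        failures = self.results.count(False)
        return failures / self.window > self.max_failure_ratio
```

With the defaults, exactly five failures in ten requests does not trip the threshold; a sixth does.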
docs/config-smart-download.md
Outdated
| In the event that all sources have reached the threshold of consecutive failures and a download attempt has returned an error, the SSR can be configured to wait for a specified amount of time and then retry the download. If a connection is successfully made, the download will resume where it left off. | ||
| When the timeout is enabled, the SSR waits for a configurable amount of time (default is 10800s) for the download to complete. When the timeout value is reached, the download is marked as **Failed** and the retry delay begins. |
This is not quite accurate. The retry delay will begin once we have marked all download sources as unavailable, as described in the failover resilience section. If enabled, once this timeout is hit, the download will be entirely stopped and marked as a failure. Or in other words, the retries happen inside of this timeout, not after it.
docs/config-smart-download.md
Outdated
| ### Sequenced HA Download | ||
| The SSR supports sequenced downloading; one node of an HA pair downloads an image from the remote repository, and the other node waits for it to complete. Once that download is complete, the second node downloads it from the first. When targeting an HA router, the download is sequenced by default. To disable this sequencing, use `request system software download simultaneous disable`. |
One note about "the second node downloads it from the first": the peer is the first place that an HA router will attempt to download from, so in most cases this would be the case. But if for whatever reason the connection to the peer went down, the router would move on and continue downloading from the conductor or remote sources. Not sure if that needs to be clarified or not.
Does the download happen over the HA sync connection or the HA fabric?
I believe it's the HA sync connection
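The fallback behavior discussed in this thread (peer first, then the other sources if the peer connection goes down) can be sketched as a walk over a priority list. This is only an illustration: the source labels in `PRIORITY` and the function `next_source` are hypothetical, and only "the peer is tried first" comes from the comments above.

```python
# Hypothetical labels; only "peer first" is stated in the review thread.
PRIORITY = ["ha-peer", "conductor", "remote-repository"]


def next_source(priority, has_version, unavailable):
    """Return the first source that offers the version and is still usable.

    priority    -- source names in the order they should be tried
    has_version -- set of sources where the requested version was found
    unavailable -- set of sources already marked unavailable
    """
    for source in priority:
        if source in has_version and source not in unavailable:
            return source
    # Every candidate source has been exhausted.
    return None
```

For example, `next_source(PRIORITY, {"ha-peer", "conductor"}, {"ha-peer"})` falls through to `"conductor"` once the peer is marked unavailable.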
docs/sec-conductor-onboard.md
Outdated
| ## Configuration | ||
| Three components: Onboarding conductor, router, Operational conductor. |
This is a customer-specific topology. We shouldn't limit this doc to just this use case. The doc should only talk about the router and conductor.
docs/sec-conductor-onboard.md
Outdated
| The next step in the process is to generate an onboarding token from conductor Web interface, command line, or using APIs. The generated tokens are signed by the conductor’s private key so that they cannot be altered once generated. The SSR supports two modes; Authority Wide and Router Specific tokens. These are mutually exclusive and are defined in the configuration. | ||
| #### Authority-Wide Tokens |
This concept is removed from the FS and should be deleted from the doc. We will only support per router tokens.
…tation' into 7.1.0-r2-documentation
…o 7.1.0-r2-documentation
BenMatase
left a comment
Feels like there is some duplicate information in the SCO doc.
docs/sec-conductor-onboard.md
Outdated
| ### Prerequisites | ||
| - The `secure-conductor-onboarding mode` must be enabled |
This makes it sound like the mode exists only at the authority level. We don't have a mode at the authority level at this time.
docs/sec-conductor-onboard.md
Outdated
| To provide a secure and mutually authenticated onboarding mechanism, the following information must be configured. | ||
| - Pre-shared key: The onboarding pre-shared key is a 48-character alpha-numeric string, configured at the authority or the router level. This key is mandatory for the SCO process. |
Not at the authority level for now
docs/sec-conductor-onboard.md
Outdated
| - Pre-shared key: The onboarding pre-shared key is a 48-character alpha-numeric string, configured at the authority or the router level. This key is mandatory for the SCO process. | ||
| - Conductor Public certificate: A public-private key certificate. | ||
| - Conductor CA certificate: Optionally, you can configure a public certificate signed by a preferred CA signing authority. |
Not optional
docs/sec-conductor-onboard.md
Outdated
| After the user generates an onboarding token, enter the token and other onboarding details in the onboarding UI or using CLI commands. There are two methods to onboard a router: | ||
| - Using the Command line: `secure-conductor-onboarding-token` command and `onboarding-config.json`. |
- Using the Command line: `create secure-conductor-onboarding token` command and `onboarding-config.json`.
docs/sec-conductor-onboard.md
Outdated
| 4. The router connects to the conductor over port 930 using the SSH keys exchanged in previous steps. | ||
| 5. The router is prepped and initialized by the conductor. During this process, the system goes through the reboot cycle. | ||
| Once the secure SSH tunnels are established, the SCO workflow concludes. All future communication between the router and conductor will occur on standard SSR to conductor ports such as 930, 4505, 4506, etc. |
If SCO happens, the router won't use 4505/4506 from that point on. Everything is over 930.
docs/sec-conductor-onboard.md
Outdated
| `configure authority router secure-conductor-onboarding pre-shared-secret` | ||
| The pre-shared secret is a 48-character alpha-numeric string. When enabled, any empty PSK will auto generate a random 48-byte alphanumeric string using the FIPS-approved, highly secure DRBG function from OpenSSL. Once generated, the key does not automatically change. It can be updated by the user if necessary. |
This is not complete yet
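To make the "48-character alpha-numeric string" in the excerpt concrete, here is a sketch of generating such a value. It deliberately swaps in Python's standard `secrets` module rather than the OpenSSL DRBG the doc text mentions, so it illustrates only the format of the key, not how the SSR produces it. The function name `generate_psk` is hypothetical.

```python
import secrets
import string

# Letters and digits only, matching the "alpha-numeric" wording in the doc.
ALPHABET = string.ascii_letters + string.digits


def generate_psk(length=48):
    """Return a random alphanumeric string of the given length.

    Assumption: uses Python's `secrets` module as a stand-in for the
    OpenSSL DRBG named in the doc text.
    """
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```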
docs/sec-conductor-onboard.md
Outdated
| ### Token Contents | ||
| The next step in the process is to generate an onboarding token from the conductor Web interface, command line, or using APIs. The generated tokens are signed by the conductor’s private key so that they cannot be altered once generated. The SSR supports two modes; Authority-wide and Router-specific tokens. These are mutually exclusive and are defined in the configuration. |
I think this doc needs to be scrubbed of "authority wide" tokens for now
docs/sec-conductor-onboard.md
Outdated
| The following parameters are required, and are configured at the Router level. | ||
| `configure authority router secure-conductor-onboarding mode` |
This might not match the func spec exactly, but the router-level path is at `configure authority router system secure-conductor-onboarding`. This applies to the other paths in the doc.
docs/config-smart-download.md
Outdated
| ### Auto-resume Download on WAN Failures | ||
| In the event that all sources have reached the threshold of consecutive failures and a download attempt has failed, the SSR can be configured to wait for a specified amount of time and then retry the download. If a connection is successfully made, the download will resume where it left off. | ||
| In the event that all sources have reached the threshold of consecutive failures and a download attempt has returned an error, the SSR can be configured to wait for a specified amount of time and then retry the download. If a connection is successfully made, the download will resume where it left off. Use the `software-update download enable-timeout` command to enable the retry feature. |
The enable-timeout field is separate from retries. The only thing it enables is the timeout described in the next paragraph, and retries will happen regardless of whether the timeout is enabled
docs/config-smart-download.md
Outdated
| In the event that all sources have reached the threshold of consecutive failures and a download attempt has returned an error, the SSR can be configured to wait for a specified amount of time and then retry the download. If a connection is successfully made, the download will resume where it left off. Use the `software-update download enable-timeout` command to enable the retry feature. | ||
| When the timeout is enabled, the SSR waits for a configurable amount of time (default is 10800s) for the download to complete. When the timeout value is reached, the download is marked as **Failed** and the retry delay begins. | ||
| When the timeout is enabled (software-update download enable-timeout true) the SSR will wait for a configurable amount of time (default is 10800s) for the download to complete. If the timeout value is reached without successfully downloading the software, the download is marked as "Failed". |
Is it worth noting that the timeout is enabled by default?
docs/config-smart-download.md
Outdated
| The retry delay time is the longest time to wait between retry attempts. For example, the initial retry delay starts at 30 seconds. With each failure the delay is increased exponentially. However, when that calculated value reaches the maximum retry delay time, successive wait times for additional attempts do not exceed the maximium retry delay time. The default is 3600 seconds. A maximum number of times to retry can also be configured. | ||
| The retry timeout can be disabled. If it is disabled, the download will retry indefinitely. | ||
| If the retry timeout is disabled, the download will retry indefinitely |
As mentioned above, the timeout is a separate mechanism from the retries, so I wouldn't necessarily describe it as a retry timeout. And the download would only retry indefinitely if both the timeout is disabled and `attempts` is configured to 0.
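The interaction between the two mechanisms can be sketched as a single check, which may help when rewording this paragraph. This is an illustration of the behavior described in the comments, not the SSR code: the function `stop_reason`, its return labels, and the exact comparison operators are assumptions, while the defaults (timeout enabled, 10800s, 10 attempts) come from the doc text.

```python
def stop_reason(elapsed_s, attempts_made, enable_timeout=True,
                timeout_s=10800, attempts=10):
    """Return why a download would stop, or None if it keeps retrying.

    The overall timeout and the attempt limit are independent: either one
    can end the download, and retries happen inside the timeout.
    """
    if enable_timeout and elapsed_s >= timeout_s:
        return "timeout"
    # attempts == 0 is treated as "no attempt limit".
    if attempts != 0 and attempts_made >= attempts:
        return "attempts-exhausted"
    return None
```

Under this sketch the download retries indefinitely only when `enable_timeout` is false and `attempts` is 0, which matches the correction above.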
docs/config-smart-download.md
Outdated
| ### Sequenced HA Download | ||
| The SSR supports sequenced downloading; one node of an HA pair downloads an image from the remote repository, and the other node waits for it to complete. Once that download is complete, the second node downloads it from the first. When targeting an HA router, the download is sequenced by default. To disable this sequencing, use `request system software download simultaneous disable`. |
Update: I ended up making the download unsequenced by default. I may change that in the future, but in the beta we're giving Swift, it will be unsequenced.
In order to do a sequenced download, you would use `request system software download router RouterName version SSR-X.Y.Z sequenced`.
docs/sec-conductor-onboard.md
Outdated
| After the user generates an onboarding token, enter the token and other onboarding details in the onboarding UI or using CLI commands. There are two methods to onboard a router: | ||
| - Using the Command line: `create secure-conductor-onboarding-token` command and `onboarding-config.json`. |
The command still needs to be fixed
What is wrong with it? I copied your command from the earlier review. Am I missing something?
docs/sec-conductor-onboard.md
Outdated
| To enable this feature on the conductor, verify the following: | ||
| - The `secure conductor onboarding mode` should not be disabled (see above). |
This line should be removed. The conductor/whole authority doesn't have a mode
docs/sec-conductor-onboard.md
Outdated
| The CA certificate is read from disk at the location given in `secure-conductor-onboarding ca-certificate`. | ||
| ## Token Management |
This section is a dup of the Token Creation section and can be removed
docs/config-smart-download.md
Outdated
| In the event that all sources have reached the threshold of consecutive failures and a download attempt has returned an error, the SSR can be configured to wait for a specified amount of time and then retry the download. If a connection is successfully made, the download will resume where it left off. | ||
| When the timeout is enabled (software-update download enable-timeout true) the SSR will wait for a configurable amount of time (default is 10800s) for the download to complete. If the timeout value is reached without successfully downloading the software, the download is marked as "Failed". | ||
| The timeout is enabled by default (`software-update download enable-timeout true`). The SSR waits for a configurable amount of time (default is 10800s) for the download to complete. If the timeout value is reached without successfully downloading the software, the download is marked as "Failed". |
This is accurate, but something I hadn't thought of when reviewing before is that the retry configuration in the paragraph below is probably more significant than the timeout configuration, so I might swap the two paragraphs.
docs/config-smart-download.md
Outdated
| When the timeout is enabled (software-update download enable-timeout true) the SSR will wait for a configurable amount of time (default is 10800s) for the download to complete. If the timeout value is reached without successfully downloading the software, the download is marked as "Failed". | ||
| The timeout is enabled by default (`software-update download enable-timeout true`). The SSR waits for a configurable amount of time (default is 10800s) for the download to complete. If the timeout value is reached without successfully downloading the software, the download is marked as "Failed". | ||
| The retry delay time is the longest time to wait between retry attempts. For example, the initial retry delay starts at 30 seconds. With each failure the delay is increased exponentially. However, when that calculated value reaches the maximum retry delay time, successive wait times for additional attempts do not exceed the maximium retry delay time. The default is 3600 seconds. A maximum number of times to retry can also be configured. |
typo in maximium
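A worked example of the capped exponential backoff in this paragraph might help readers of the final doc. The sketch below assumes the delay doubles on each failure; the doc text only says "increased exponentially", so `factor=2` is an assumption, as is the function name `retry_delays`. The 30-second start and 3600-second cap come from the excerpt.

```python
def retry_delays(count, initial=30, maximum=3600, factor=2):
    """Return the wait time in seconds before each of `count` retries.

    The delay grows by `factor` after every failure and is clamped to
    `maximum`, so later retries all wait the maximum retry delay.
    """
    delays = []
    delay = initial
    for _ in range(count):
        delays.append(min(delay, maximum))
        delay *= factor
    return delays


# With the defaults: 30, 60, 120, 240, 480, 960, 1920, then 3600 from there on.
print(retry_delays(9))
```

Under the doubling assumption the cap is reached on the eighth retry, since 30 * 2**7 = 3840 exceeds 3600.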
docs/config-smart-download.md
Outdated
| If the retry timeout is disabled, the download will retry indefinitely | ||
| Use the command `configure authority router system software-update download enable-timeout [enabled]` to enable auto-resume. The command parameters are listed below: |
The enable-timeout field doesn't really enable auto-resume. It's just a way you can tune the behavior to meet your needs. Maybe something along the lines of this would be more accurate?
Use the command `configure authority router system software-update download` to adjust the download retry behavior. The command parameters are listed below:
docs/config-smart-download.md
Outdated
| - `enable-timeout`: True/false, default is true. This enables a time limit for the overall download. | ||
| - `timeout`: Amount of time in seconds that the SSR waits for the software download to complete. When the timeout value is reached the download is marked as **Failed**, and the retry delay begins. The default download wait time is 10800s. Range is 1800s - 604800s. | ||
| - `attempts`: The maximum number of attempts to download before considering the download as failed. If set to 0, the SSR will retry the download until the timeout is hit. Default is 10. | ||
| - `max retry delay`: The maximum amount of time in seconds to wait in between retry attempts. The retry delay will start off low and back off exponentially up to this duration. Range is 0 to 86400s. Default is 3600s. |
maximum-retry-delay
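The defaults and ranges in the parameter list above can be summarized as a small validation sketch, which is a convenient way to cross-check the numbers in the doc. The class `DownloadSettings` and its method are hypothetical and are not part of the SSR data model. The lower bound on `attempts` is an assumption, since the excerpt gives no range for it.

```python
from dataclasses import dataclass


@dataclass
class DownloadSettings:
    """Defaults taken from the parameter list under review."""
    enable_timeout: bool = True
    timeout: int = 10800              # seconds
    attempts: int = 10                # 0 means no attempt limit
    maximum_retry_delay: int = 3600   # seconds

    def validate(self):
        """Return a list of human-readable range errors (empty if valid)."""
        errors = []
        if not 1800 <= self.timeout <= 604800:
            errors.append("timeout must be between 1800 and 604800 seconds")
        # Assumption: the excerpt does not state a range for attempts.
        if self.attempts < 0:
            errors.append("attempts must be 0 or greater")
        if not 0 <= self.maximum_retry_delay <= 86400:
            errors.append("maximum-retry-delay must be between 0 and 86400 seconds")
        return errors
```

For example, `DownloadSettings(timeout=60).validate()` reports the timeout as out of range, while the defaults validate cleanly.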
docs/concepts-config-integrity.md
Outdated
| 4. fscrypt uses the FEK to automatically unlock the necessary encrypted directories. | ||
| This systemd service handles the subsequent boots of the SSR after Configuration Integrity has been enabled. It runs a series of integrity checks, and identifies when the system is ready to continue operation after successful unlocking of the encrypted directories. When it is run, it performs the following sequence: | ||
| If any of these steps fail, it is interpreted as an integrity event. Network activities are blocked. An emergency log is generated and broadcast to all consoles on the system that the system integrity is compromised and it must be reprovisioned. The SSR will repeatedly try to start the integrity service to unlock the encrypted directories and fail, each time writing the emergency log. |
extra space after "system"
…gy/docs into 7.1.0-r2-documentation
| - The `secure-conductor-onboarding` must be enabled | ||
| - The `secure-conductor-onboarding public-key` field must be configured | ||
| - The `secure-conductor-onboarding ca-certificate` field must be configured |
We have added config time validation that the conductor nodes must also have asset ids configured if SCO is enabled. Not sure if we want to call it out here.
docs/sec-conductor-onboard.md
Outdated
| `configure authority router system secure-conductor-onboarding mode` | ||
| - `disabled`: Default is true, must be false to enable. | ||
| - `psk-only`: Configured on devices with no TPM, but which require the Secure Conductor Onboarding workflow. |
`psk-only` has been removed as an option. Now `weak` will generate a self-signed cert per authentication attempt for non-TPM devices.
docs/sec-conductor-onboard.md
Outdated
| To read the EK from the public cloud instance, run `tpm2_readpublic -c 0x81010001 -f DER -o /dev/stdout -Q | base64 -w0` and configure the contents in the endorsement-key field above. | ||
| ::: | ||
| - Disable salt state on conductor. |
This section can be removed. 4505 and 4506 will now be automatically closed after the SCO is enabled on the conductor and the conductors are restarted.
docs/sec-conductor-onboard.md
Outdated
| - `weak`: This setting enables SCO but allows the router to use a self-signed certificate. This conductor will skip the CA certificate validation for this router. | ||
| - `strong`: On SSR devices manufactured with a device ID (SSR400/SSR440), `strong` mode ensures that the asset-id matches the serial number field in the subject line of the router’s public certificate. For vTPM workflows, the router’s endorsement key must match the `endorsement-key` configuration. | ||
| `configure authority router system secure-conductor-onboarding pre-shared-secret` |
This parameter is no longer required. It will be auto generated if not specified
| - For devices with a built-in dev-id certificate | ||
| ``` | ||
| config authority router router1 system secure-conductor-onboarding mode strong | ||
| config authority router router1 system secure-conductor-onboarding pre-shared-secret (removed) |
Do we want to call out that this is now optional?
| - For Public cloud VMs with vTPM | ||
| ``` | ||
| config authority router router1 system secure-conductor-onboarding mode strong | ||
| config authority router router1 system secure-conductor-onboarding pre-shared-secret (removed) |
Same here, this config is now optional
docs/sec-conductor-onboard.md
Outdated
| ### Known Caveats | ||
| - During SCO onboarding of the router in an HA deployment, both the conductor nodes should be online and able to talk to each other. |
This is no longer a caveat
docs/sec-conductor-onboard.md
Outdated
| exit | ||
| ``` | ||
| If any checks fail, the `create system connectivity` command returns an error explaining why. This command can be run as many times as needed for each node. All information to form the token is present in the configuration. |
| If any checks fail, the `create system connectivity` command returns an error explaining why. This command can be run as many times as needed for each node. All information to form the token is present in the configuration. | |
| If any checks fail, the `create secure-conductor-onboarding token` command returns an error explaining why. This command can be run as many times as needed for each router. All information to form the token is present in the configuration. |
docs/sec-conductor-onboard.md
Outdated
| - Enable ssh-only for asset resiliency. | ||
| `configure authority asset-connection-resiliency ssh-only true ` | ||
| - Enable SCO for each router. |
| - Enable SCO for each router. | |
| - Enable SCO for each router config on the conductor. |
Don't love my wording, but I had someone confuse this as this config needs to be applied on the router itself. How can we make that more clear that this config is applied on the conductor?
docs/sec-conductor-onboard.md
Outdated
| :::note | ||
| In the current beta delivery (7.1.3-1r2) this step must be performed to disable ports 4505 and 4506 so any devices not using this feature will fail to onboard to the conductor. | ||
| Ports 4505 and 4506 are automatically closed after SCO is enabled on the conductor and the conductor is restarted. |
I feel like this implies that the conductor is automatically restarted. Maybe: "Ports 4505 and 4506 are automatically closed after SCO is enabled on the conductor once a user restarts the conductors." Not sure.
docs/sec-conductor-onboard.md
Outdated
| `configure authority router system secure-conductor-onboarding mode` | ||
| - `disabled`: Default is true, must be false to enable. | ||
| - `psk-only`: Configured on devices with no TPM, but which require the Secure Conductor Onboarding workflow. |
This line should be removed
…tation' of github.com:128technology/docs into 7.1.0-r2-documentation
No description provided.