# packrat ereport storage and snitch implementation #2126
## Conversation
```
@@ -0,0 +1,31 @@
# Gimletlet Ereport test application
```
I don't feel strongly about this, but I'd be fine having this be part of the basic gimletlet image, since it's additive.
Yeah. I was thinking we might not always want it to save on flash, but that seems like a problem that could be solved when it actually becomes a problem!
```rust
// We *could* include the Hubris image ID here even if our VPD
// identity hasn't been set, but sending an empty metadata map
// ensures that MGS will continue asking for metadata on subsequent
// requests.
```
Protocol question: what ensures that MGS keeps asking for metadata?
MGS always gets the new restart ID, so subsequent requests from MGS won't take this conditional (unless MGS deliberately avoids updating its restart ID if there's no metadata?)
When MGS receives a request to read ereports from a SP, if its metadata map for that SP is empty, it will send a "metadata refresh request" with restart ID 0 and a limit of 0 ereports, before it forwards the request it received:
https://github.com/oxidecomputer/management-gateway-service/blob/77e316c812aa057b9714d0d99c4a7bdd36d45be2/gateway-sp-comms/src/ereport.rs#L109-L150
MGS doesn't actually store the restart ID, that's stored by whoever is making requests to MGS. It only updates its metadata map if it receives a non-empty metadata map from the SP, or if the request's restart ID doesn't match the SP's:
Looking at the code again, I think the `Option` around the metadata map complicates things, and it really should be checking whether the actual map is empty... I think there might be a bug where we don't re-request metadata if the SP restarts and then sends empty metadata. I'll go fix that.
```rust
    ByteGather(r.slices.0, r.slices.1),
);
let mut c =
    minicbor::encode::write::Cursor::new(&mut self.recv[..]);
```
Should we implement `minicbor::write::Write` on a (wrapper type around a) lease, to avoid having to copy data twice? It would be a decent amount of plumbing, so maybe not required in this PR...
that's a good idea. we should probably do that in a lib crate, because tasks that want to send ereports to packrat will probably also want to write into a lease.
however, there is another reason for the double-buffering in this instance, which is that we're currently using it to determine how long the ereport is before copying it into the lease, so that we can put it in the lease only if it fits. since what we're doing here is "write as many ereports as fit", we don't want to error out if we run out of space when encoding. we could probably handle that by hanging onto the previous position we had written to, and not advancing it unless we wrote the whole thing.
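The measure-then-copy approach described above can be sketched roughly as follows. All names here are hypothetical stand-ins (the real code encodes with minicbor into a scratch buffer before copying into the lease); the point is that the output position only advances once a whole record is known to fit:

```rust
// Hypothetical sketch of "write as many ereports as fit"; not the
// actual packrat API. `encode_record` stands in for the real minicbor
// encoding step.

/// Encode one record into `scratch`, returning the encoded length,
/// or `None` if it doesn't even fit in the scratch buffer.
fn encode_record(record: &[u8], scratch: &mut [u8]) -> Option<usize> {
    if record.len() > scratch.len() {
        return None;
    }
    scratch[..record.len()].copy_from_slice(record);
    Some(record.len())
}

/// Copy as many whole records as fit into `out`, never splitting one.
fn fill_lease(records: &[&[u8]], out: &mut [u8]) -> usize {
    let mut scratch = [0u8; 64];
    let mut pos = 0;
    for record in records {
        // First pass: learn the encoded size without touching `out`.
        let len = match encode_record(record, &mut scratch) {
            Some(len) => len,
            None => break,
        };
        // Only advance `pos` if the whole record fits in what's left.
        if pos + len > out.len() {
            break;
        }
        out[pos..pos + len].copy_from_slice(&scratch[..len]);
        pos += len;
    }
    pos
}

fn main() {
    let recs: [&[u8]; 3] = [b"aaaa", b"bbbb", b"cccc"];
    let mut out = [0u8; 8];
    // Only the first two 4-byte records fit in the 8-byte output.
    let n = fill_lease(&recs, &mut out);
    assert_eq!(n, 8);
    println!("wrote {n} bytes");
}
```

A lease-backed `Write` impl would remove the copy, but (as noted above) it then needs an error type that distinguishes "out of space" from "lease write failed".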
> we could probably handle that by hanging onto the previous position we had written to, and not advancing it unless we wrote the whole thing.
Yeah, I was thinking along those lines. We'd need a wrapper type and a new error type which distinguishes between "lease write failed" and "ran out of space", so I'm fine punting for now.
yeah, i started on this, but it's just complicated enough i'd kinda prefer to do it in a follow-up branch
Did this in #2164; there was a bit of jank due to the `minicbor` API, but I think it was worth it.
Presently, the `ereport::Worker` struct [stores the metadata map in an `Option`][1]. Metadata refresh requests (`restart_id=0, start_ena=0, limit=0`) are [sent to the SP if the `Option` is `None`][2]. The option [is set to `Some`][3] if we receive a packet from the SP where the metadata map is non-empty, or if the restart ID mismatches the requested one.

If I recall correctly, the `Option` was intended to distinguish between "we just started up" and "we received an explicit empty metadata map". But I don't actually think we _should_ be distinguishing between those cases. When the SP has restarted and given us an empty metadata map, this may be because we requested ereports from `packrat` _before_ VPD has been loaded (as I discussed in oxidecomputer/hubris#2126 (comment)). In that case, when the SP sends us an empty metadata map, we want to keep requesting the metadata on every subsequent request, as it might be set later.

Thus, this commit just removes the `Option` and has it start out with an _empty_ map, and overwrites the existing map if the restart IDs are mismatched, *or* any time the current map is empty and the received one is non-empty. I've also added an additional test for this behavior.

Fixes #409

[1]: https://github.com/oxidecomputer/management-gateway-service/blob/77e316c812aa057b9714d0d99c4a7bdd36d45be2/gateway-sp-comms/src/ereport.rs#L79-L83
[2]: https://github.com/oxidecomputer/management-gateway-service/blob/77e316c812aa057b9714d0d99c4a7bdd36d45be2/gateway-sp-comms/src/ereport.rs#L109-L111
[3]: https://github.com/oxidecomputer/management-gateway-service/blob/77e316c812aa057b9714d0d99c4a7bdd36d45be2/gateway-sp-comms/src/ereport.rs#L351-L360
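The rules in this commit message can be sketched roughly like this (types and names are hypothetical simplifications of the `ereport::Worker` state, using a plain integer in place of the real restart ID):

```rust
use std::collections::BTreeMap;

type MetadataMap = BTreeMap<String, String>;

/// Hypothetical sketch of the metadata-tracking rules; not the actual
/// gateway-sp-comms types.
struct Worker {
    restart_id: u64,
    metadata: MetadataMap, // starts empty; no Option
}

impl Worker {
    fn new() -> Self {
        Self { restart_id: 0, metadata: MetadataMap::new() }
    }

    /// A metadata refresh request must precede ereport reads whenever
    /// we currently hold no metadata.
    fn needs_metadata_refresh(&self) -> bool {
        self.metadata.is_empty()
    }

    /// Apply a response from the SP: take the new map if the SP has
    /// restarted, or if we have nothing and the SP sent something.
    fn update(&mut self, sp_restart_id: u64, received: MetadataMap) {
        if sp_restart_id != self.restart_id
            || (self.metadata.is_empty() && !received.is_empty())
        {
            self.restart_id = sp_restart_id;
            self.metadata = received;
        }
    }
}

fn main() {
    let mut w = Worker::new();
    assert!(w.needs_metadata_refresh());

    // SP restarted but VPD not yet loaded: an empty map comes back.
    w.update(1, MetadataMap::new());
    // We must keep asking, since metadata may appear later.
    assert!(w.needs_metadata_refresh());

    // Later, the same incarnation reports real metadata.
    let mut m = MetadataMap::new();
    m.insert("serial".into(), "BRM42".into());
    w.update(1, m);
    assert!(!w.needs_metadata_refresh());
}
```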
Things fail.
Not finding out about it sucks.
This branch implements the Hubris side of the ereport ingestion system,
as described in RFD 545. Work on this was started by @cbiffle in #2002,
which implemented the core ring-buffer data structure used to store
ereports. Meanwhile, oxidecomputer/management-gateway-service#370,
oxidecomputer/omicron#7803, and oxidecomputer/omicron#8296 added
the MGS and Omicron components of this system.
This branch picks up where Cliff left off, and "draws the rest of the owl" by implementing the aggregation of ereports in the `packrat` task using this data structure, and adding a new `snitch` task, which acts as a proxy to allow ereports stored by `packrat` to be read over the management network.
## Architecture
Ereports are stored by `packrat` because we would like as many tasks as possible to be able to report errors by making IPC calls to the task responsible for ereport storage. This means that the task aggregating ereports must be a high-priority task, so that as many other tasks as possible may be its clients. Additionally, we would like to include the system's VPD identity as metadata for ereports, and this data is already stored by `packrat`. Finally, we would like to minimize the likelihood of the task that stores ereports crashing, as this would result in data loss, and `packrat` is already expected not to crash.
On the other hand, the task that actually evacuates these ereports over the management network must run at a priority lower than that of the `net` task, of which it is a client. Thus the separation of responsibilities between `packrat` and the `snitch`. The snitch task is fairly simple. It receives packets sent to the ereport socket, interprets the request message, and forwards the request to `packrat`. Any ereports sent back by `packrat` are sent in response to the request. The snitch ends up being a pretty dumb, stateless proxy: since the response packet is encoded by `packrat`, all we end up doing is taking the bytes received from `packrat` and stuffing them into the socket's send queue. The real purpose of this thing is just to serve as a trampoline between the high priority level of `packrat` and a priority level lower than that of the `net` task.
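As a rough illustration of the trampoline described above (all names here are hypothetical; the real snitch uses Hubris IPC and leases rather than closures and `Vec`s):

```rust
/// Hypothetical sketch of the snitch's stateless forwarding step. The
/// `packrat` closure stands in for the IPC call: packrat encodes the
/// entire response packet, and the snitch just relays the bytes.
fn forward_request(
    request: &[u8],
    packrat: impl Fn(&[u8], &mut [u8]) -> usize,
    tx_queue: &mut Vec<u8>,
) -> usize {
    let mut response = [0u8; 1024];
    // packrat encodes the whole response packet; snitch keeps no state.
    let len = packrat(request, &mut response);
    // Stuff the encoded bytes into the socket's send queue verbatim.
    tx_queue.extend_from_slice(&response[..len]);
    len
}

fn main() {
    let mut tx = Vec::new();
    // Stand-in for packrat: echo the request back as the response.
    let n = forward_request(
        b"read-ereports",
        |req, buf| {
            buf[..req.len()].copy_from_slice(req);
            req.len()
        },
        &mut tx,
    );
    assert_eq!(n, 13);
    assert_eq!(&tx[..], &b"read-ereports"[..]);
}
```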
## `snitch-core` Fixes

While testing behavior when the ereport buffer is full, I found a potential panic in the existing `snitch-core` code. Previously, every time ereports were read from the buffer while it was in the `Losing` state (i.e., ereports have been discarded because the buffer was full), `snitch-core` attempted to insert a new loss record at the end of the buffer (calling `recover_if_needed()`). This ensures that the data loss is reported to the reader ASAP. The problem is that this code assumed that there would always be space for an additional loss record, and panicked if it didn't fit. I added a test reproducing this panic in ff93754, and fixed it in 22044d1 by changing the calculation of whether recovery is possible.
When `recover_if_needed()` is called while in the `Losing` state, we call the `free_space()` method to determine whether we can recover. In the `Losing` state, this method would calculate the free space by subtracting the space required for the loss record that must be encoded to transition out of the `Losing` state. However, in the case where `recover_if_needed()` is called with `required_space: None` (which indicates that we're not trying to recover because we want to insert a new record, but just because we want to report ongoing data loss to the caller), we check that the free space is greater than or equal to 0. This means that we would still try to insert a loss record even if the free space was 0, resulting in a panic. I've fixed this by moving the check that there's space for a loss record out of the `free_space()` calculation and into the required space, adding it to the requested value (which is 0 in the "we are inserting the loss record to report loss" case). This way, we only insert the loss record if it fits, which is the correct behavior.
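A minimal sketch of the corrected check, with hypothetical names and a flat `free` counter standing in for the real ring-buffer arithmetic:

```rust
// Hypothetical size of an encoded loss record; not the real value.
const LOSS_RECORD_SIZE: usize = 8;

#[derive(PartialEq)]
enum State {
    Normal,
    Losing,
}

/// Hypothetical sketch of the corrected space check; the real
/// `snitch-core` tracks positions inside a ring buffer.
struct Store {
    state: State,
    free: usize, // bytes currently unused in the buffer
}

impl Store {
    /// Free space no longer silently subtracts the loss record's size;
    /// it just reports what's actually unused.
    fn free_space(&self) -> usize {
        self.free
    }

    /// Recovering from `Losing` requires room for the loss record
    /// *plus* whatever the caller wants to insert (0 when we're only
    /// trying to report ongoing loss). The buggy version subtracted
    /// LOSS_RECORD_SIZE inside free_space() and then accepted
    /// "free >= 0", inserting a loss record into a full buffer.
    fn can_recover(&self, required_space: Option<usize>) -> bool {
        let needed = LOSS_RECORD_SIZE + required_space.unwrap_or(0);
        self.state == State::Losing && self.free_space() >= needed
    }
}

fn main() {
    // Buffer completely full while losing data: recovery must wait.
    let full = Store { state: State::Losing, free: 0 };
    assert!(!full.can_recover(None)); // previously, this case panicked

    // Once a reader frees enough room, the loss record fits.
    let drained = Store { state: State::Losing, free: LOSS_RECORD_SIZE };
    assert!(drained.can_recover(None));

    // Not losing data: nothing to recover from.
    let normal = Store { state: State::Normal, free: 100 };
    assert!(!normal.can_recover(None));
}
```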
I've also changed the assignment of ENAs in `snitch-core` to start at 1, rather than 0, since ENA 0 is reserved in the wire protocol to indicate "no ENA". In the "committed ENA" request field this means "don't flush any ereports", and in the "start ENA" response field, ENA 0 means "no ereports in this packet". Thus, the ereport store must start assigning ENAs at ENA 1 for the initial loss record.
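The ENA-assignment rule can be illustrated with a trivial counter (a hypothetical sketch, not the actual `snitch-core` type):

```rust
/// ENA 0 is reserved in the wire protocol to mean "no ENA", so the
/// store must hand out ENAs starting at 1.
struct EnaCounter(u64);

impl EnaCounter {
    fn new() -> Self {
        // Start at 1: even the initial loss record must not get ENA 0.
        EnaCounter(1)
    }

    fn next(&mut self) -> u64 {
        let ena = self.0;
        self.0 += 1;
        ena
    }
}

fn main() {
    let mut c = EnaCounter::new();
    assert_eq!(c.next(), 1); // first record (e.g. the initial loss record)
    assert_eq!(c.next(), 2);
}
```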
## Testing
Currently, no tasks actually produce ereports. To test that everything works correctly, it was necessary to add a source of ereports, so I've added a little task that just generates test ereports when asked via `hiffy`. I've included some of that in [this comment][4]. This was also used for testing the data-loss behavior discussed above.