Stream patches to the front end from edda; try to reduce nats write load; priority queue for MV builds #6410
An example of one of the streamed patches on the websocket, in a human readable-ish version:
{
"workspaceId": "01GWAJW3PQ0S1DVBN76XH9DBHV",
"changeSetId": "01JXN8KGNG2CK01ETFH33VBNJE",
"kind": "Component",
"id": "01JXN8MRMF6AZ50CTMXPKT741Y",
"fromChecksum": "616b523fb65078f0063ad308e6e7cf1b",
"toChecksum": "214531f044b7b66cc11976eb8fea6847",
"patch": [
{
"op": "replace",
"path": "/resourceDiff/current",
"value": "{\n \"si\": {\n \"name\": \"StackFleetAssociation\",\n \"type\": \"component\",\n \"color\": \"#FF9900\"\n },\n \"domain\": {\n \"extra\": {\n \"Region\": \"us-east-1\",\n \"AwsResourceType\": \"AWS::AppStream::StackFleetAssociation\",\n \"PropUsageMap\": \"{\\\"createOnly\\\":[],\\\"updatable\\\":[\\\"StackName\\\",\\\"FleetName\\\"],\\\"secrets\\\":[]}\"\n }\n }\n}"
},
{
"op": "replace",
"path": "/resourceDiff/diff",
"value": "+{\n+ \"si\": {\n+ \"name\": \"StackFleetAssociation\",\n+ \"type\": \"component\",\n+ \"color\": \"#FF9900\"\n+ },\n+ \"domain\": {\n+ \"extra\": {\n+ \"Region\": \"us-east-1\",\n+ \"AwsResourceType\": \"AWS::AppStream::StackFleetAssociation\",\n+ \"PropUsageMap\": \"{\\\"createOnly\\\":[],\\\"updatable\\\":[\\\"StackName\\\",\\\"FleetName\\\"],\\\"secrets\\\":[]}\"\n+ }\n+ }\n+}"
}
],
"messageKind": "StreamingPatch"
}
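To make the message contract concrete, here is a minimal sketch of how a consumer could verify and apply one of these streamed patches. The real consumer is the web front end, so this is illustrative only: the struct shape is an assumption reverse-engineered from the example payload above, and it leans on the `json_patch` crate for the RFC 6902 application.

```rust
use serde::Deserialize;
use serde_json::Value;

// Shape assumed from the example payload above; not the actual si types.
#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct StreamingPatchMessage {
    workspace_id: String,
    change_set_id: String,
    kind: String,
    id: String,
    from_checksum: String,
    to_checksum: String,
    patch: json_patch::Patch, // the RFC 6902 operations array
    message_kind: String,
}

/// Applies a streamed patch to a locally cached MV object and returns the new
/// checksum to record, or `None` if the cached object isn't the version the
/// patch was computed from (in which case the caller should fetch the full
/// object instead of patching).
fn apply_streaming_patch(
    cached: &mut Value,
    cached_checksum: &str,
    msg: &StreamingPatchMessage,
) -> Result<Option<String>, json_patch::PatchError> {
    if cached_checksum != msg.from_checksum {
        return Ok(None); // out of sync: fall back to a direct fetch
    }
    json_patch::patch(cached, &msg.patch)?;
    Ok(Some(msg.to_checksum.clone()))
}
```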
@jhelwig thank you for the example payload.
The "placeholder"/"staging" index checksum should probably be the change set id since we mostly treat the checksum as an opaque string, and it will never overlap with a "real" checksum. |
I like your approach of sending both the incremental patches and the batch, as we can migrate/experiment with both approaches.
Looks good!
@@ -522,13 +532,30 @@ async fn build_mv_inner(
    match execution_result {
        Ok((maybe_patch, maybe_frontend_object)) => {
            // We need to make sure the frontend object is inserted into the store first so that
            // a client can directly fetch it without racing against the object's insertion if the
            // client does not already have the base object to apply the streaming patch to.
👍🏽
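For context on the thread above, here is a sketch of that insert-then-announce ordering, assuming the async-nats KV and publish APIs; the key, subject, and function names are made up for illustration.

```rust
use async_nats::jetstream::kv;
use bytes::Bytes;

/// Write the full FrontEndObject into the KV store *before* publishing the
/// streaming patch, so a client that lacks the base object can fetch it
/// directly without racing the insertion.
async fn publish_built_mv(
    store: &kv::Store,
    nats: &async_nats::Client,
    key: &str,
    frontend_object: Bytes,
    patch_msg: Bytes,
) -> Result<(), Box<dyn std::error::Error>> {
    // 1. Object first...
    store.put(key, frontend_object).await?;
    // 2. ...patch announcement second ("edda.streaming_patch" is a stand-in
    //    subject, not the real one).
    nats.publish("edda.streaming_patch", patch_msg).await?;
    Ok(())
}
```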
Initial sketch of streaming patches to the front end from edda
The idea with streaming patches to the front end is that it will be able to start using the new MVs as soon as they are generated, before all of the data is ready, and there will be a message with the MvIndex patch (message not created/sent yet) that will let the front end know that the stream of patches has finished, and which objects it should have by the end (fetching any it missed the patch for).
Right now, these streaming patches are sent in addition to the existing patch batches, but once we start using the streamed patches in the front end, we'll want to stop creating & sending the batched patches, as they will be purely redundant with the streamed patches.
We will also want to figure out if the existing `IndexUpdate` message should be augmented to include both the from-checksum & patch for the MvIndex, or if we want an entirely new message to handle this.

Try to reduce nats load by only inserting FrontEndObjects if they're not already in the KV store
This doesn't reduce the amount of information sent over the wire to nats on an insert, but nats should be able to short-circuit and avoid doing any write in the case where the object has already been generated for any reason.
With the interface provided by the nats KV store, our options for only inserting if the key doesn't already exist looked to be:

* Send the (potentially) new thing over the wire and watch for a failure (`create` behaves this way).
* Try to get first (and receive the full value).
* Iterate over all of the keys to see if the key is already there.

Trying the insert using `create` and ignoring the "already exists" error kind seems to be the least-bad of the options, as sketched below.
Use a priority queue to determine order of MV builds in edda

When streaming patches out to the front end, it would be helpful for the front end's reactivity to send lists early, so it can display skeletons for individual items it hasn't yet received patch information for.
Right now, there are two "priorities":

* Lists
* Details (everything else)

By default an MV will use the "Details" priority. If the `build_priority` annotation is provided for the MV struct definition, then it will use whatever is specified in the annotation. (A sketch of this ordering follows the next paragraph.)

This also re-structures how MV build tasks are spawned so that we can get everything queued before letting the priority queue handle things from there, to ensure the higher-priority items are always built first. Rather than having to explicitly check if the Change's `entity_kind` is the same as the MV's `trigger_entity`, this check has been abstracted away into the `MaterializedViewInventoryItem` and is checked during the initial queue population. This means that whenever we go to spawn an MV build task, we already know that the change is relevant for that specific MV kind.
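A minimal sketch of the two-level ordering described above (the enum and queue here are illustrative, not edda's actual types):

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// Lower variants sort first; Lists outranks Details so list MVs build early
// and the UI can render skeleton rows while details stream in.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum BuildPriority {
    Lists,
    Details, // the default for MVs without a `build_priority` annotation
}

fn main() {
    // Reverse turns the max-heap into a min-heap, so Lists pop first.
    let mut queue = BinaryHeap::new();
    queue.push(Reverse((BuildPriority::Details, "ComponentDetails")));
    queue.push(Reverse((BuildPriority::Lists, "ComponentList")));
    queue.push(Reverse((BuildPriority::Details, "ViewDetails")));

    while let Some(Reverse((priority, mv_kind))) = queue.pop() {
        println!("building {mv_kind} at {priority:?} priority");
    }
    // Prints ComponentList first, then the two Details builds.
}
```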
Make streamed patches from edda a config setting instead of always on

Rather than always sending the patches for the front end both as individual patches when each is built, and as a single batch after all have been built, there is now a `SI_EDDA__STREAMING_PATCHES` setting to toggle between the two behaviors. The default behavior is to send the patches as a single batch, making the newer streaming behavior opt-in for development & testing.
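A sketch of reading that toggle, assuming the `SI_EDDA__` environment-variable prefix implied by the setting name; edda's actual config loader may parse it differently:

```rust
/// Returns true when streaming patches are enabled; defaults to the batch
/// behavior when the variable is unset or unrecognized.
fn streaming_patches_enabled() -> bool {
    std::env::var("SI_EDDA__STREAMING_PATCHES")
        .map(|v| matches!(v.trim(), "1" | "true" | "TRUE" | "True"))
        .unwrap_or(false)
}
```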