-
Notifications
You must be signed in to change notification settings - Fork 1.3k
[wrangler] Improve asset upload performance with single-file uploads #14305
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| --- | ||
| "wrangler": patch | ||
| --- | ||
|
|
||
| Improve asset upload performance with single-file uploads | ||
|
|
||
| Asset uploads now use a more efficient per-file upload path when the platform enables it. This is rolled out server-side and requires no configuration changes. Existing upload behavior is unchanged when the new path is not enabled. | ||
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -1,4 +1,5 @@ | ||||||
| import assert from "node:assert"; | ||||||
| import { createReadStream } from "node:fs"; | ||||||
| import { readdir, readFile, stat } from "node:fs/promises"; | ||||||
| import * as path from "node:path"; | ||||||
| import { parseStaticRouting } from "@cloudflare/workers-shared/utils/configuration/parseStaticRouting"; | ||||||
|
|
@@ -27,7 +28,7 @@ import prettyBytes from "pretty-bytes"; | |||||
| import { FormData } from "undici"; | ||||||
| import { fetchResult, logger } from "../../shared/context"; | ||||||
| import { hashFile } from "./hash"; | ||||||
| import { isJwtExpired } from "./jwt"; | ||||||
| import { decodeJwtPayload, isJwtExpired } from "./jwt"; | ||||||
| import type { SharedDeployVersionsProps } from "../../shared/types"; | ||||||
| import type { AssetConfig, RouterConfig } from "@cloudflare/workers-shared"; | ||||||
| import type { | ||||||
|
|
@@ -50,6 +51,7 @@ type UploadResponse = { | |||||
|
|
||||||
| // constants same as Pages for now | ||||||
| const BULK_UPLOAD_CONCURRENCY = 3; | ||||||
| const EDGE_KV_UPLOAD_CONCURRENCY = 25; | ||||||
| const MAX_UPLOAD_ATTEMPTS = 5; | ||||||
| const MAX_UPLOAD_GATEWAY_ERRORS = 5; | ||||||
|
|
||||||
|
|
@@ -140,53 +142,83 @@ export const syncAssets = async ( | |||||
| }); | ||||||
| }); | ||||||
|
|
||||||
| const queue = new PQueue({ concurrency: BULK_UPLOAD_CONCURRENCY }); | ||||||
| const useEdgeKvUpload = isEdgeKvUpload(initializeAssetsResponse.jwt); | ||||||
| const uploadBuckets = useEdgeKvUpload | ||||||
| ? assetBuckets.flat().map((entry) => [entry]) | ||||||
| : assetBuckets; | ||||||
|
|
||||||
| const queue = new PQueue({ | ||||||
| concurrency: useEdgeKvUpload | ||||||
| ? getEdgeKvUploadConcurrency(initializeAssetsResponse.jwt) | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. do we need concurrency limits for single-file uploads? i guess probably if there's any level of buffering, and definitely better start there and increase it.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. not sure i'm following: is that what you meant? or some kind of server side concurrency limit maybe?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i mainly meant that we might not need control concurrency for single-file uploads - we do this for batches uploads because we buffer so much and can easily crush the memory used for the upload, but with a true passthrough single-file upload, maybe we could just never need to limit concurrency? but i know we're not there right now so it's a moot point
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ah I see what you mean. well, the limit now configurable and controllable from EWC. so, we can iteratively bump it higher and monitor for issues. if we want to remove it entirely later, maybe we can! |
||||||
| : BULK_UPLOAD_CONCURRENCY, | ||||||
| }); | ||||||
| const queuePromises: Array<Promise<void>> = []; | ||||||
| let attempts = 0; | ||||||
| const start = Date.now(); | ||||||
| let completionJwt = ""; | ||||||
| let uploadedAssetsCount = 0; | ||||||
|
|
||||||
| for (const [bucketIndex, bucket] of assetBuckets.entries()) { | ||||||
| attempts = 0; | ||||||
| for (const [bucketIndex, bucket] of uploadBuckets.entries()) { | ||||||
| let attempts = 0; | ||||||
| let gatewayErrors = 0; | ||||||
| const doUpload = async (): Promise<UploadResponse> => { | ||||||
| // Populate the payload only when actually uploading (this is limited to 3 concurrent uploads at 50 MiB per bucket meaning we'd only load in a max of ~150 MiB) | ||||||
| // This is so we don't run out of memory trying to upload the files. | ||||||
| const payload = new FormData(); | ||||||
| const uploadedFiles: string[] = []; | ||||||
| for (const manifestEntry of bucket) { | ||||||
| const absFilePath = path.join(assetDirectory, manifestEntry[0]); | ||||||
| uploadedFiles.push(manifestEntry[0]); | ||||||
| payload.append( | ||||||
| manifestEntry[1].hash, | ||||||
| new File( | ||||||
| [(await readFile(absFilePath)).toString("base64")], | ||||||
| manifestEntry[1].hash, | ||||||
| { | ||||||
| // Most formdata body encoders (incl. undici's) will override with "application/octet-stream" if you use a falsy value here | ||||||
| // Additionally, it appears that undici doesn't support non-standard main types (e.g. "null") | ||||||
| // So, to make it easier for any other clients, we'll just parse "application/null" on the API | ||||||
| // to mean actually null (signal to not send a Content-Type header with the response) | ||||||
| type: getContentType(absFilePath) ?? "application/null", | ||||||
| } | ||||||
| ), | ||||||
| manifestEntry[1].hash | ||||||
| ); | ||||||
| } | ||||||
|
|
||||||
| try { | ||||||
| const res = await fetchResult<UploadResponse>( | ||||||
| complianceConfig, | ||||||
| `/accounts/${accountId}/workers/assets/upload?base64=true`, | ||||||
| { | ||||||
| method: "POST", | ||||||
| headers: { | ||||||
| Authorization: `Bearer ${initializeAssetsResponse.jwt}`, | ||||||
| }, | ||||||
| body: payload, | ||||||
| let res: UploadResponse; | ||||||
| if (useEdgeKvUpload) { | ||||||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this is the new codepath: if the gate is on, use the new single-file upload endpoint |
||||||
| const manifestEntry = bucket[0]; | ||||||
| const absFilePath = path.join(assetDirectory, manifestEntry[0]); | ||||||
| const contentType = getContentType(absFilePath); | ||||||
| res = await fetchResult<UploadResponse>( | ||||||
| complianceConfig, | ||||||
| `/accounts/${accountId}/workers/assets/upload/${manifestEntry[1].hash}`, | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. didn't realize we were using the same endpoint but with different semantics, i kinda figured we'd have a new endpoint? not an issue though
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. the old bulk one is: and the new single file-one is: so they are different, unless i'm misunderstanding the question?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. oooo i just wasn't reading closely enough haha. my bad
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. haha no worries! |
||||||
| { | ||||||
| method: "POST", | ||||||
| headers: { | ||||||
| Authorization: `Bearer ${initializeAssetsResponse.jwt}`, | ||||||
| "Content-Type": contentType ?? "application/null", | ||||||
| }, | ||||||
| body: createReadStream(absFilePath), | ||||||
| duplex: "half", | ||||||
| } | ||||||
| ); | ||||||
| } else { | ||||||
| // Populate the payload only when actually uploading (this is limited to 3 concurrent uploads at 50 MiB per bucket meaning we'd only load in a max of ~150 MiB) | ||||||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this is the old codepath: just moved down into the |
||||||
| // This is so we don't run out of memory trying to upload the files. | ||||||
| const payload = new FormData(); | ||||||
| for (const manifestEntry of bucket) { | ||||||
| const absFilePath = path.join(assetDirectory, manifestEntry[0]); | ||||||
| payload.append( | ||||||
| manifestEntry[1].hash, | ||||||
| new File( | ||||||
| [(await readFile(absFilePath)).toString("base64")], | ||||||
| manifestEntry[1].hash, | ||||||
| { | ||||||
| // Most formdata body encoders (incl. undici's) will override with "application/octet-stream" if you use a falsy value here | ||||||
| // Additionally, it appears that undici doesn't support non-standard main types (e.g. "null") | ||||||
| // So, to make it easier for any other clients, we'll just parse "application/null" on the API | ||||||
| // to mean actually null (signal to not send a Content-Type header with the response) | ||||||
| type: getContentType(absFilePath) ?? "application/null", | ||||||
| } | ||||||
| ), | ||||||
| manifestEntry[1].hash | ||||||
| ); | ||||||
| } | ||||||
| ); | ||||||
| res = await fetchResult<UploadResponse>( | ||||||
| complianceConfig, | ||||||
| `/accounts/${accountId}/workers/assets/upload?base64=true`, | ||||||
| { | ||||||
| method: "POST", | ||||||
| headers: { | ||||||
| Authorization: `Bearer ${initializeAssetsResponse.jwt}`, | ||||||
| }, | ||||||
| body: payload, | ||||||
| } | ||||||
| ); | ||||||
| } | ||||||
| uploadedAssetsCount += bucket.length; | ||||||
| logAssetsUploadStatus( | ||||||
| numberFilesToUpload, | ||||||
|
|
@@ -225,7 +257,7 @@ export const syncAssets = async ( | |||||
| throw new FatalError( | ||||||
| `Upload took too long.\n` + | ||||||
| `Asset upload took too long on bucket ${bucketIndex + 1}/${ | ||||||
| initializeAssetsResponse.buckets.length | ||||||
| uploadBuckets.length | ||||||
| }. Please try again.\n` + | ||||||
| `Assets already uploaded have been saved, so the next attempt will automatically resume from this point.`, | ||||||
| { telemetryMessage: "Asset upload took too long" } | ||||||
|
|
@@ -274,6 +306,23 @@ export const syncAssets = async ( | |||||
| return completionJwt; | ||||||
| }; | ||||||
|
|
||||||
| function isEdgeKvUpload(jwt: string): boolean { | ||||||
| try { | ||||||
| return decodeJwtPayload(jwt).edge_kv === true; | ||||||
| } catch { | ||||||
| return false; | ||||||
| } | ||||||
| } | ||||||
|
|
||||||
| export function getEdgeKvUploadConcurrency(jwt: string): number { | ||||||
| try { | ||||||
| const value = Number(decodeJwtPayload(jwt).edge_kv_upload_concurrency); | ||||||
| return value > 0 ? Math.floor(value) : EDGE_KV_UPLOAD_CONCURRENCY; | ||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟡 When
Suggested change
Was this helpful? React with 👍 or 👎 to provide feedback.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. it's fine lol we will not set this to a fractional value between 0 and 1 on the server side |
||||||
| } catch { | ||||||
| return EDGE_KV_UPLOAD_CONCURRENCY; | ||||||
| } | ||||||
| } | ||||||
|
|
||||||
| export const buildAssetManifest = async (dir: string) => { | ||||||
| const files = await readdir(dir, { recursive: true }); | ||||||
| logReadFilesFromDirectory(dir, files); | ||||||
|
|
||||||
Uh oh!
There was an error while loading. Please reload this page.