[miniflare] Expose R2 via an S3-compatible API locally #14280
tahmid-23 wants to merge 26 commits
… R2 public endpoint Write methods get 401 (not 405): r2.dev has no authenticated mode, writes go through the S3 API or bindings. Failed preconditions return a bare 412 with no object headers, like r2.dev (whose error responses carry Cloudflare's HTML error pages; locally only the status code is mimicked). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…on the local R2 public endpoint R2's HTTP endpoints only accept a single range with start <= end; anything else (including multiple ranges) is rejected with 400 rather than ignored. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
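As an illustration of the single-range rule described in this commit, here is a minimal sketch of a `Range` header validator. The name `parseSingleRange` and the `ParsedRange` type are hypothetical and are not taken from the PR; checks against the object size are deliberately left out, and a `null` result stands for "answer 400".

```typescript
// Hypothetical helper, not the PR's implementation: accept exactly one byte
// range and return null for anything else so the caller can answer 400.
type ParsedRange =
  | { kind: "offset"; start: number; end?: number } // bytes=S-E or bytes=S-
  | { kind: "suffix"; length: number }; // bytes=-N

function parseSingleRange(header: string): ParsedRange | null {
  // A comma (multiple ranges), another unit, or stray characters fail the match.
  const match = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
  if (match === null) return null;
  const rawStart = match[1] ?? "";
  const rawEnd = match[2] ?? "";
  if (rawStart === "" && rawEnd === "") return null; // "bytes=-" carries no range
  if (rawStart === "") return { kind: "suffix", length: Number(rawEnd) };
  const start = Number(rawStart);
  if (rawEnd === "") return { kind: "offset", start };
  const end = Number(rawEnd);
  if (start > end) return null; // inverted range is rejected, not ignored
  return { kind: "offset", start, end };
}
```

Returning `null` instead of silently falling back to the full object mirrors the "rejected with 400 rather than ignored" behavior the commit message describes.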
r2.dev answers ranged HEAD requests with a bodyless 206 and Content-Range; previously the range was ignored for HEAD and a 200 with the full length was returned. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
🦋 Changeset detected
Latest commit: dee1979
The changes in this PR will be included in the next version bump. This PR includes changesets to release 6 packages.
Codeowners approval required for this PR:
create-cloudflare
@cloudflare/deploy-helpers
@cloudflare/kv-asset-handler
miniflare
@cloudflare/pages-shared
@cloudflare/unenv-preset
@cloudflare/vite-plugin
@cloudflare/vitest-pool-workers
@cloudflare/workers-auth
@cloudflare/workers-editor-shared
@cloudflare/workers-utils
wrangler
    if (preconditions !== undefined) {
        const recheck = await bucket.get(key, { onlyIf: preconditions });
        if (recheck === null) {
            return handlers.notFound();
        }
        if (!("body" in recheck)) {
            return handlers.preconditionFailed();
        }
🚩 Public endpoint 412 responses no longer include object headers
The old public.worker.ts returned c.body(null, { status: 412, headers: objectHeaders(recheck) }) for precondition failures, including ETag, Last-Modified, etc. The refactored code delegates to serveR2Object which calls handlers.preconditionFailed() → c.body(null, 412) with no headers. This is a behavioral change from the old code. The 304 path is unaffected (headers are included via the new Response(null, { status: 304, headers }) in serve.worker.ts:110). Since the PR goal is to match r2.dev behavior and r2.dev may return bare 412s, this is likely intentional, but worth confirming against the real r2.dev endpoint.
petebacondarwin left a comment
@tahmid-23 - thanks for putting this together. It is indeed a lot of change. I think it is much more likely to be reviewed if we break it up into the initial bug fixes, and then a second PR to add the S3 compatibility. Apart from anything else, we will need to get R2 product sign-off on this as a feature, whereas bug fixes we can land more simply.
3703a8b to b14551c
    export interface ServeHandlers {
        notFound(): Response | Promise<Response>;
        preconditionFailed(): Response;
🟡 Public endpoint 412 responses lose object metadata headers (ETag, Last-Modified, Accept-Ranges)
The refactored serveR2Object calls handlers.preconditionFailed() which, for the public endpoint, is () => c.body(null, 412), returning a bare 412 with no headers. The old public.worker.ts code explicitly included object headers via c.body(null, { status: 412, headers: objectHeaders(recheck) }), providing ETag, Last-Modified, and Accept-Ranges in the 412 response. The ServeHandlers.preconditionFailed signature (preconditionFailed(): Response) takes no arguments, so there is no way for the handler to include the object headers that serveR2Object already computed at serve.worker.ts:74. The 304 path at serve.worker.ts:110 correctly includes these headers, creating an asymmetry. This is a change from the old behavior. Note that RFC 7232 only requires such header fields (for example ETag) on 304 responses (Section 4.1); it makes no equivalent recommendation for 412 (Section 4.2).
Prompt for agents
The ServeHandlers.preconditionFailed callback signature takes no arguments, so when serveR2Object calls it at line 104, the already-computed object headers (line 74) are discarded. The public endpoint's 412 response previously included ETag, Last-Modified, and Accept-Ranges.
To fix, either:
1. Change the preconditionFailed signature to accept a Headers parameter: `preconditionFailed(headers: Headers): Response` and pass the computed headers from line 74. Update the public endpoint handler to use them: `preconditionFailed: (headers) => c.body(null, { status: 412, headers })`. The S3 endpoint handler can ignore the parameter.
2. Or, have serveR2Object build the 412 response directly (like it does for 304 at line 110) and remove preconditionFailed from ServeHandlers, only calling a handler for the S3 endpoint's special XML error response via a different mechanism.
…c endpoint `object.range` may carry all keys with some undefined (e.g. `suffix` present but undefined on an offset range), so normalize by value rather than key presence. Suffix ranges (`bytes=-N`) previously fell through to a 200 whose Content-Length reported the full object size against a partial body. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
b14551c to edefe3b
@petebacondarwin fair enough. I reordered a bit and put the first 8 commits into #14323.
edefe3b to a6519e6
…ic endpoint The simulator clamps ranges starting at or beyond the object size; r2.dev rejects them with 416. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…dpoint Hono already percent-decodes path params, so decoding them again corrupted keys containing `%`: a literal `%` that does not form a valid escape threw a URIError (surfacing as a bare 500), and keys like `a%2Bb` were silently served as `a+b`. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
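The two failure modes named in this commit can be reproduced with plain `decodeURIComponent`, independent of any router. The snippet below is only a demonstration of the double-decoding hazard; the variable names are illustrative and the first `decodeURIComponent` call stands in for the decoding the router has already performed.

```typescript
// The client percent-encodes the key "a%2Bb" as "a%252Bb" in the URL path.
const rawPath = "a%252Bb";
// First decode (standing in for the router): yields the real key.
const onceDecoded = decodeURIComponent(rawPath); // "a%2Bb"
// Second decode (the bug): silently resolves a different object key.
const twiceDecoded = decodeURIComponent(onceDecoded); // "a+b"

// A literal "%" that does not start a valid escape cannot be decoded again.
function decodeThrowsUriError(value: string): boolean {
  try {
    decodeURIComponent(value);
    return false;
  } catch (error) {
    return error instanceof URIError;
  }
}

const literalPercentKey = decodeURIComponent("100%25"); // "100%"
const secondDecodeFails = decodeThrowsUriError(literalPercentKey); // true
```

Decoding exactly once keeps the key byte-for-byte identical to what the client addressed, which is why the fix is to drop the extra decode rather than to catch the error.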
The public endpoint must fetch with `bucket.get()` even for HEAD (only `get` evaluates conditional headers and ranges), and the 416 path returns before reading the body. Cancel the stream in both cases instead of leaving it open until garbage collection. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ation Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
When set, the R2 bucket will be served over a local S3-compatible API during local development, authenticated with the configured AWS SigV4 credentials. This commit adds the config surface and validation; the endpoint itself and the wrangler dev wiring follow. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Pure refactor: namespace entries flow through verbatim instead of being rebuilt from a fixed field list, so a plugin can carry extra per-entry fields without forking the helper. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Buckets configured with s3Credentials are served by a new r2:s3 service at /cdn-cgi/local/r2/s3/<bucket-id>. This commit adds the plugin option, the credential-conflict check, the entry-worker routing, and a stub worker that resolves buckets and answers with S3-style XML errors; the actual S3 protocol follows. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…3 endpoint Everything both SigV4 authentication methods (Authorization header and presigned URL query parameters) share: canonical request construction (with S3's single-encoded canonical URI), the signing-key derivation and signature comparison, credential-scope and date parsing with R2's validation order and error messages, and the SignatureDoesNotMatch debug response. verifyRequest() is wired into dispatch but rejects everything with R2's missing-authorization error until the two verifiers land in the following commits. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
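For readers unfamiliar with the shared core mentioned in this commit, the following is a minimal sketch of the standard SigV4 signing-key derivation and a constant-time signature comparison, written against Node's `node:crypto`. The function names are hypothetical and this is not the PR's code; the region `auto` and service `s3` used in the usage note are assumptions based on how R2's S3 API is normally addressed.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// HMAC-SHA256 helper returning raw bytes.
function hmac(key: string | Buffer, data: string): Buffer {
  return createHmac("sha256", key).update(data, "utf8").digest();
}

// Standard SigV4 chain: secret -> date -> region -> service -> "aws4_request".
// `date` is the YYYYMMDD part of the credential scope.
function deriveSigningKey(
  secretAccessKey: string,
  date: string,
  region: string,
  service: string
): Buffer {
  const kDate = hmac(`AWS4${secretAccessKey}`, date);
  const kRegion = hmac(kDate, region);
  const kService = hmac(kRegion, service);
  return hmac(kService, "aws4_request");
}

// Compare the presented hex signature with the expected one in constant time.
function signatureMatches(
  signingKey: Buffer,
  stringToSign: string,
  presentedHex: string
): boolean {
  const expected = hmac(signingKey, stringToSign);
  const presented = Buffer.from(presentedHex, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return presented.length === expected.length && timingSafeEqual(presented, expected);
}
```

A verifier would call `deriveSigningKey(secret, "20240101", "auto", "s3")` with the date and scope parsed from the request, build the string to sign from the canonical request, and pass the client's signature to `signatureMatches`.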
… S3 endpoint Implements the Authorization-header authentication method on top of the shared verification core: header field parsing, the x-amz-date / date-header fallback, request-time skew bounds, payload-hash verification, and R2's error responses for each failure. Authenticated requests still answer NotImplemented; operations follow. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…point Implements the SigV4 query-parameter authentication method on top of the shared verification core: required presigned parameters reported together like R2, X-Amz-Expires bounds (403 ExpiredRequest past expiry, a week at most), the UNSIGNED-PAYLOAD canonical request, and X-Amz-Signature exclusion from the canonical query string. The Authorization header keeps precedence over presigned query parameters. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…'s errors for unimplemented surfaces Adds the operation pipeline: detection yields either a terminal Response or a BoundOperation whose screening rules are applied before its handler runs. Detection reproduces R2's routing for the surfaces its S3 endpoint recognizes but does not implement: object/bucket subresource catch-alls with R2's templated "<name> not implemented" errors (including the GetGetBucketPolicyStatus typo), bucket PUT/DELETE subresource routing, the unsigned bucket-POST presigned-post 501, and a header-screening skeleton (x-amz-security-token rejection). With no operations implemented yet, everything else falls back to a screened NotImplemented operation; each operation group replaces that fallback as its detection lands. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Pure refactor: move range parsing, object headers, conditional handling, and range responses out of public.worker.ts into a serveR2Object() shared with the upcoming S3 endpoint, parameterized by endpoint-specific error handlers. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Implements the first two operations, GetObject and HeadObject, served through the shared serveR2Object() with R2's conditional and range fidelity, and binds detected operations to the table entries holding their screening rules and handlers. Reads reject SSE-C headers with R2's InvalidRequest error. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
PutObject and DeleteObject, with R2's Content-MD5 verification, storage-class validation, custom-metadata collection, conditional-write semantics, and write-header screening (the always-rejected write headers, x-amz-acl no-ops, and x-amz-server-side-encryption validation). Includes R2's POST-as-PutObject quirk: the x-amz-copy-source header is ignored on POST, matching real R2. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
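As a sketch of what Content-MD5 verification involves in general (not the PR's code), the check below recomputes the base64-encoded MD5 digest of the body and compares it with the header value. The function name `contentMd5Matches` is hypothetical, and the error returned on mismatch is left to the caller because the exact R2 response is not stated here.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: true when the Content-MD5 header equals the
// base64-encoded MD5 digest of the request body.
function contentMd5Matches(body: Uint8Array, contentMd5Header: string): boolean {
  const expected = createHash("md5").update(body).digest("base64");
  return expected === contentMd5Header.trim();
}
```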
HeadBucket, GetBucketLocation, and R2's static bucket-configuration reads (encryption, versioning, tagging, object lock, replication), whose responses are identical for every bucket. XML documents are built with fast-xml-parser, bundled into the worker at build time. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
ListObjects and ListObjectsV2: pagination via markers and continuation tokens, delimiter grouping into CommonPrefixes, max-keys handling, start-after, encoding-type=url, and R2's strict rejection of unknown list search parameters. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
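The delimiter grouping mentioned in this commit follows the usual S3 listing semantics, sketched below over an in-memory key list. The names `groupByDelimiter` and `ListingPage` are hypothetical, and pagination, `max-keys`, and URL encoding are intentionally omitted.

```typescript
interface ListingPage {
  contents: string[]; // keys returned as individual objects
  commonPrefixes: string[]; // keys rolled up at the first delimiter after the prefix
}

// Hypothetical sketch of S3-style delimiter grouping, not the PR's implementation.
function groupByDelimiter(keys: string[], prefix: string, delimiter: string): ListingPage {
  const contents: string[] = [];
  const commonPrefixes = new Set<string>();
  for (const key of keys) {
    if (!key.startsWith(prefix)) continue;
    const rest = key.slice(prefix.length);
    // An empty delimiter means no grouping at all.
    const index = delimiter === "" ? -1 : rest.indexOf(delimiter);
    if (index === -1) {
      contents.push(key);
    } else {
      // Everything up to and including the first delimiter forms one prefix.
      commonPrefixes.add(prefix + rest.slice(0, index + delimiter.length));
    }
  }
  return { contents, commonPrefixes: Array.from(commonPrefixes) };
}
```

For example, listing `["a.txt", "photos/1.jpg", "photos/2.jpg"]` with an empty prefix and `/` as the delimiter yields one object (`a.txt`) and one common prefix (`photos/`).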
Parses and validates the Delete XML document like real R2: MalformedXML for invalid or oversized key lists, strict-but-ignored Quiet validation, Content-MD5 verification, and idempotent deletes with missing keys still reported as Deleted. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Parses and resolves x-amz-copy-source across the bound buckets, maps the x-amz-copy-source-if-* headers onto standard conditionals, honors the COPY/REPLACE metadata directive, and screens SSE-C headers on the copy source. Brings in fast-xml-parser for the XML success documents. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
CreateMultipartUpload, UploadPart, UploadPartCopy, CompleteMultipartUpload, and AbortMultipartUpload, built on the binding's resumeMultipartUpload(). Binding errors are mapped back onto R2's S3 error responses by their v4 codes (workerd only exposes them via the error message), and the simulator's internal-error responses for unknown upload ids are mapped onto NoSuchUpload. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The account-level route verifies the request against each configured credential set (preferring the most specific auth error) and lists the buckets matching the presented credentials. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…the local S3 endpoint Pass the configured credentials through to miniflare's r2Buckets option for local (non-remote) buckets, enabling the S3-compatible endpoint during wrangler dev. With the endpoint now fully wired up, this also adds the miniflare changeset, documents the known gaps vs real R2 in the S3 worker's module header, and aligns the lockfile with the new miniflare devDependencies. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
a6519e6 to dee1979
This PR is based off of #14119, which laid the groundwork.
Motivation
R2 provides an S3 compatibility layer that exposes R2 operations via the S3 API. However, this is currently not replicated in wrangler.
This feature is especially useful for presigned requests, and in particular presigned uploads. Presigned uploads allow a user to upload a file directly to R2, rather than going indirectly through a worker (in production). However, wrangler does not currently support any S3-compatible API, meaning that presigned uploads used in production can't be tested locally (without writing your own worker for this exact purpose). Otherwise, I do not believe an S3-compatible local API is strictly needed (although I could be mistaken).
Hence, the primary motivation for adding this S3-compatible local API is presigned uploads. However, if I'm going to add presigned upload support, I figured I ought to just support S3 to the maximum extent possible, since presigned uploads are fundamentally an S3 concept.
Regardless, having the S3-compatible API locally makes testing easier if using an S3 client is preferred.
Implementation
I realize that this is a sizable change. I did my best to split my commits into small chunks, so that each commit is realistically reviewable. The commits build upon each other sequentially and do not revert prior work (up to "Add changesets for the local R2 S3 endpoint and public-bucket fixes", i.e. before any review feedback commits).
Public Worker Fixes
The first few commits are a series of patches to the existing local public worker. These fix minor bugs and inconsistencies with Cloudflare's public R2 endpoints. (I wanted to drive parity here, since I didn't feel that the bugs necessarily warranted their own PR. These fixes are also equally important for correctness in the S3 API.)
Configuration
We expose the following configuration field in `wrangler.jsonc`:

    {
      "r2_buckets": [
        {
          "binding": "BUCKET",
          "bucket_name": "my-bucket",
          "experimental_local_s3_credentials": {
            "accessKeyId": "local-access-key-id",
            "secretAccessKey": "local-secret-access-key",
          },
        },
      ],
    }

If any bucket is configured with `experimental_local_s3_credentials`, it will launch the S3 worker at `cdn-cgi/local/r2/s3` and use the credentials to authorize requests to the local S3 API.
Authorization
We implement AWS's SigV4 signing algorithm on the server side. This is implemented for both Authorization header authorization (for using the standard S3 client) and presigned query parameter authorization (for presigned requests). These are verified in tests using the standard `@smithy/signature-v4` package.
Operations
The Cloudflare docs on S3 compatibility describe which operations are fully supported. We implement these to the maximum extent possible.
However, some features are not implementable in local development. For example, "SSE-C" cannot be supported, because miniflare does not internally expose any API to forward the SSE-C parameter to workerd R2.
`index.worker.ts` contains a comment describing some of the limitations.
I used Claude to determine the error messages and conditions against my public R2-S3 endpoint. I tried to maintain parity with production R2-S3 as much as possible (error message formatting, error codes, order of checks, etc.).