
feat: blob storage for Mops packages #477

Draft
rvanasa wants to merge 3 commits into main from registry-blob-storage

Conversation

Collaborator

@rvanasa commented Apr 9, 2026

Migrates the Mops registry from on-chain per-file storage (in dynamically spawned Storage canisters) to Caffeine Immutable Object Storage. Packages are now archived as .tar.gz, uploaded to the Caffeine gateway, and referenced by a single SHA-256 content hash stored on-chain. Existing on-chain packages remain fully accessible via the legacy download path.


Copilot AI left a comment


Pull request overview

This PR migrates the Mops registry publish/download flow from per-file on-chain storage canisters to Caffeine Immutable Object Storage by uploading a single .tar.gz blob and persisting its SHA-256 content hash on-chain, while keeping legacy on-chain packages accessible.

Changes:

  • Bump API version to 1.4 and extend the main canister interface with blob publish and blob-hash query endpoints.
  • Update CLI publishing to archive package files and upload the archive to blob storage, then finalize publish with the blob hash.
  • Extend mops.lock to v4 to support blob-hash-locked packages alongside legacy per-file hashes.

Reviewed changes

Copilot reviewed 20 out of 22 changed files in this pull request and generated 8 comments.

Show a summary per file
  • package-lock.json: Dependency lockfile adjustments.
  • cli/mops.ts: Bump CLI API version to 1.4.
  • cli/integrity.ts: Add lockfile v4 with blobHashes support and blob-hash integrity checks.
  • cli/declarations/main/main.did.js: Regenerated Candid JS declarations for new blob APIs.
  • cli/declarations/main/main.did.d.ts: Regenerated TS declarations for new blob APIs.
  • cli/declarations/main/main.did: Regenerated Candid interface with blob APIs.
  • cli/commands/publish.ts: Publish via .tar.gz + blob upload + finalize with blob hash.
  • cli/api/storageClient.ts: New client for merkle-tree hashing + chunked blob upload/download via gateway.
  • cli/api/downloadPackageFiles.ts: Add blob-based download path (tar extraction) with legacy fallback.
  • backend/main/types.mo: Add CreateCertificateResult type for blob storage protocol.
  • backend/main/registry/Registry.mo: Store blobHashByPackageId and create blob-based package releases.
  • backend/main/PackagePublisher.mo: Add startBlobPublish/finishBlobPublish flow for blob publishing.
  • backend/main/main-canister.mo: Blob storage protocol endpoints, blob publish endpoints, and backup schema bump.
Comments suppressed due to low confidence (1)

backend/main/main-canister.mo:485

  • Returning #ok([]) from getFileIds when a blob hash exists will cause legacy clients that still depend on getFileIds for installation to silently proceed with an empty file set (resulting in broken installs without an explicit error). Consider returning an error that instructs clients to upgrade/use the blob download path, or providing a backwards-compatible way to download blob-based packages.
  };

  public query func getPackageVersionHistory(name : PackageName) : async [PackageSummaryWithChanges] {
    let ?versions = registry.getPackageVersions(name) else Debug.trap("Package '" # name # "' not found");
    versions
    |> Array.reverse(_)
    |> Array.map<PackageVersion, PackageSummaryWithChanges>(
      _,
      func(version) {
        let ?summary = _getPackageSummaryWithChanges(name, version) else Debug.trap("Package '" # name # "' not found");
        summary;
      },


Comment on lines +410 to +423
// Create tar.gz archive of package files (excluding docs.tgz)
let sourceFiles = files.filter((f) => f !== docsFile);
console.log("Creating package archive...");
let archivePath = path.join(rootDir, ".mops/.publish-archive.tar.gz");
fs.mkdirSync(path.dirname(archivePath), { recursive: true });
await tarCreate(
  {
    gzip: true,
    file: archivePath,
    cwd: rootDir,
    portable: true,
  },
  sourceFiles,
);

Copilot AI Apr 9, 2026


sourceFiles comes from globbySync() (which uses process.cwd() by default), but the archive is created with cwd: rootDir. If mops publish is executed from a subdirectory, the file paths may not be relative to rootDir, causing tarCreate to miss files or throw. Consider calling globbySync(..., { cwd: rootDir }) (or normalizing each path with path.relative(rootDir, ...)) so the tar input paths match the archive cwd consistently.
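
As a sketch of the suggested normalization (the helper name and `rootDir` usage here are illustrative, not the PR's actual code), glob results can be rebased onto the archive root before being passed to tar:

```typescript
import * as path from "path";

// Hypothetical helper: makes glob output (which may be relative to
// process.cwd()) relative to rootDir, so the paths match tar's cwd.
function toArchivePaths(files: string[], rootDir: string): string[] {
  return files.map((f) => {
    // Resolve against the invocation directory, then rebase onto rootDir.
    const abs = path.isAbsolute(f) ? f : path.resolve(process.cwd(), f);
    return path.relative(rootDir, abs);
  });
}
```

Passing `{ cwd: rootDir }` to globbySync would avoid the rebasing step entirely, which is likely the simpler fix.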

Copilot uses AI. Check for mistakes.
Comment on lines +466 to +468
rootHash = await uploadBlob(archiveData, identity, (pct) => {
  logUpdate(
    `Uploading to blob storage ${progressBar(Math.round((pct / 100) * 80) + 2 * (100 / total), 100)}`,

Copilot AI Apr 9, 2026


The progress bar math inside the uploadBlob callback can exceed the declared total (e.g. the computed value can go above 100), which makes progressBar() render beyond its intended width and conflicts with the step-based progress updates. Consider using a single consistent scale (either step-based or percent-based) and clamping the value passed to progressBar() to the chosen total.

Suggested change
-rootHash = await uploadBlob(archiveData, identity, (pct) => {
-  logUpdate(
-    `Uploading to blob storage ${progressBar(Math.round((pct / 100) * 80) + 2 * (100 / total), 100)}`,
+const uploadStepStart = 2 * (100 / total);
+const uploadStepSize = 100 / total;
+rootHash = await uploadBlob(archiveData, identity, (pct) => {
+  const uploadProgress = Math.min(
+    100,
+    Math.max(
+      uploadStepStart,
+      Math.round(uploadStepStart + (pct / 100) * uploadStepSize),
+    ),
+  );
+  logUpdate(
+    `Uploading to blob storage ${progressBar(uploadProgress, 100)}`,

Comment on lines +34 to +48

let filesData = new Map<string, Array<number>>();

await new Promise<void>((resolve, reject) => {
  let parser = new TarParser();
  parser.on("entry", (entry: ReadEntry) => {
    let entryPath = sanitizeTarPath(entry.path);
    if (!entryPath) {
      entry.resume();
      return;
    }
    let chunks: Buffer[] = [];
    entry.on("data", (chunk: Buffer) => chunks.push(chunk));
    entry.on("end", () => {
      let data = Buffer.concat(chunks);

Copilot AI Apr 9, 2026


archiveData is a .tar.gz, but it is fed directly into tar's parser without gunzipping or enabling gzip support. This will fail to parse (or emit garbage) for gzipped archives. Consider either gunzipping first (e.g. via zlib.gunzipSync) or constructing the parser with gzip support (if supported by the tar version).
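
A minimal sketch of the gunzip-first option (assuming archiveData is the gzipped tar bytes as a Buffer or Uint8Array):

```typescript
import * as zlib from "zlib";

// Decompress the .tar.gz synchronously so the raw tar bytes can be fed
// into the tar parser; zlib.gunzipSync throws on invalid gzip input,
// which surfaces corrupt archives early instead of producing garbage.
function gunzipArchive(archiveData: Uint8Array): Buffer {
  return zlib.gunzipSync(archiveData);
}
```

Whether the tar package's parser can instead be constructed with gzip support depends on the tar version in use.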

Comment on lines +36 to +42

await new Promise<void>((resolve, reject) => {
  let parser = new TarParser();
  parser.on("entry", (entry: ReadEntry) => {
    let entryPath = sanitizeTarPath(entry.path);
    if (!entryPath) {
      entry.resume();

Copilot AI Apr 9, 2026


Entries from the tar archive are stored using entry.path without validation. A crafted archive can include paths like ../... or absolute paths, which could lead to path traversal when callers later write these files to disk. Consider rejecting absolute paths, normalizing with path.posix.normalize, and ensuring the final path stays within the intended package root; also consider skipping non-file entries (directories, symlinks) explicitly.
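
One way to sanitize entry paths along the lines suggested above (this is an illustrative sketch, not the PR's sanitizeTarPath implementation):

```typescript
import * as path from "path";

// Returns a safe, normalized relative path, or null to skip the entry.
function sanitizeTarPath(entryPath: string): string | null {
  // Reject absolute paths, including Windows drive-letter paths.
  if (path.posix.isAbsolute(entryPath) || /^[a-zA-Z]:/.test(entryPath)) {
    return null;
  }
  const normalized = path.posix.normalize(entryPath);
  // Reject anything that escapes the package root after normalization.
  if (normalized === ".." || normalized.startsWith("../")) {
    return null;
  }
  return normalized;
}
```

Skipping non-file entries (directories, symlinks) via `entry.type` would be a complementary check at the parser level.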

Comment on lines +218 to +223
): Promise<Response> {
  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), FETCH_TIMEOUT_MS);
  return fetch(url, { ...init, signal: controller.signal }).finally(() =>
    clearTimeout(timeout),
  );

Copilot AI Apr 9, 2026


The default MOPS_STORAGE_PROJECT_ID fallback ("0000000-0000-0000-0000-00000000000") does not match the typical 8-4-4-4-12 UUID-like format (it appears to be the wrong length). If the gateway expects a strict format, this default will cause hard-to-debug upload/download failures. Consider correcting the default value (or making it required and failing fast with a clear error).
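
A sketch of the fail-fast option (the env var name comes from this comment; the strict 8-4-4-4-12 regex is an assumption about what the gateway accepts):

```typescript
// UUID-like format: 8-4-4-4-12 hex groups.
const PROJECT_ID_RE =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Require a well-formed project ID instead of falling back to a
// malformed default that fails later with an opaque gateway error.
function getStorageProjectId(): string {
  const id = process.env.MOPS_STORAGE_PROJECT_ID;
  if (!id || !PROJECT_ID_RE.test(id)) {
    throw new Error(
      "MOPS_STORAGE_PROJECT_ID must be set to a UUID-formatted project ID",
    );
  }
  return id;
}
```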

Comment on lines 530 to +551
    publishingFiles.delete(publishingId);
    publishingPackages.delete(publishingId);
    publishingPackageFileStats.delete(publishingId);
    publishingTestStats.delete(publishingId);
    publishingNotes.delete(publishingId);
    publishingBenchmarks.delete(publishingId);
    publishingDocsCoverage.delete(publishingId);

    #ok({
      config = publishing.config;
      publication;
      isNewPackage;
    });
  };

  public func finishBlobPublish(caller : Principal, publishingId : PublishingId, blobHash : Text) : async Result.Result<{ config : PackageConfigV3; publication : PackagePublication; isNewPackage : Bool }, PublishingErr> {
    assert (not Principal.isAnonymous(caller));

    if (not _isValidBlobHash(blobHash)) {
      return #err("Invalid blob hash format. Expected 'sha256:<64-lowercase-hex-chars>'");
    };


Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

finishBlobPublish bypasses the existing package size enforcement (MAX_PACKAGE_SIZE) used for per-file uploads, and publishingPackageFileStats stays at its default values for blob publishes. This enables publishing arbitrarily large archives and makes stored file stats inaccurate. Consider passing blob size (and ideally file count) from the CLI to the canister and validating it here (or adding an equivalent enforcement mechanism for blob uploads).
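
As a complementary client-side guard (a sketch only; the value of MAX_PACKAGE_SIZE here is an assumption for illustration, and the canister must still enforce the limit server-side since client checks can be bypassed):

```typescript
// Illustrative limit; the real constant lives in the canister code.
const MAX_PACKAGE_SIZE = 50 * 1024 * 1024; // 50 MiB (assumed value)

// Reject oversized archives before uploading, giving a clear error
// instead of a late failure at the gateway or canister.
function assertBlobWithinLimit(byteLength: number): void {
  if (byteLength > MAX_PACKAGE_SIZE) {
    throw new Error(
      `Package archive is ${byteLength} bytes, exceeding the ${MAX_PACKAGE_SIZE}-byte limit`,
    );
  }
}
```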

Comment on lines +134 to +142
    let publication = {
      user = newRelease.userId;
      time = Time.now();
      storage = Principal.fromText("aaaaa-aa");
    };
    packagePublications.put(packageId, publication);

    blobHashByPackageId.put(packageId, newRelease.blobHash);


Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blob package publications set publication.storage to the management canister principal (aaaaa-aa). This value is used by the CLI (e.g. getPackageFilesInfo()/install flow) to create a storage actor, and can lead to confusing failures or silent installs when blob-based packages are encountered. Consider using a dedicated sentinel value that clients can reliably detect (or extending the API to omit/optionally provide storage for blob-based packages).

Comment on lines +99 to +165

function buildMerkleTree(
  fileData: Uint8Array,
  contentType: string,
): {
  chunks: Uint8Array[];
  chunkHashes: Uint8Array[];
  blobTree: BlobHashTreeJSON;
  rootHash: string;
} {
  if (fileData.length === 0) {
    throw new Error("Cannot build merkle tree from empty data");
  }

  const chunks = splitChunks(fileData);
  const chunkHashes = chunks.map((c) => chunkHash(c));

  let level: TreeNode[] = chunkHashes.map((h) => ({
    hash: h,
    left: null,
    right: null,
  }));

  while (level.length > 1) {
    const nextLevel: TreeNode[] = [];
    for (let i = 0; i < level.length; i += 2) {
      const left = level[i]!;
      const right = level[i + 1] ?? null;
      const parent = nodeHash(left.hash, right ? right.hash : null);
      nextLevel.push({ hash: parent, left, right });
    }
    level = nextLevel;
  }

  const chunksRoot = level[0]!;

  const headers: Record<string, string> = {
    "Content-Type": contentType,
    "Content-Length": fileData.length.toString(),
  };
  const headerLines = Object.entries(headers).map(
    ([k, v]) => `${k.trim()}: ${v.trim()}`,
  );
  headerLines.sort();

  const metaHash = metadataHash(headers);
  const metaNode: TreeNode = { hash: metaHash, left: null, right: null };
  const combinedHash = nodeHash(chunksRoot.hash, metaNode.hash);
  const combinedRoot: TreeNode = {
    hash: combinedHash,
    left: chunksRoot,
    right: metaNode,
  };

  const blobTree: BlobHashTreeJSON = {
    tree_type: "DSBMTWH",
    chunk_hashes: chunkHashes.map(hashToShaString),
    tree: nodeToJSON(combinedRoot),
    headers: headerLines,
  };

  return {
    chunks,
    chunkHashes,
    blobTree,
    rootHash: hashToShaString(combinedRoot.hash),
  };

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new storage client implements hashing, certificate retrieval, and multipart upload logic, but there are currently no CLI tests exercising it (couldn’t find any references under cli/tests). Given the risk of subtle protocol/encoding regressions, consider adding unit tests for buildMerkleTree (root hash stability) and mocked-network tests for uploadBlob/downloadBlob error handling/retries.
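
The shape of a root-hash stability test could look like the following. This is a self-contained stand-in, not the gateway's DSBMTWH scheme: it hashes fixed-size chunks with sha256 and pairs them upward, purely to illustrate the determinism and sensitivity assertions a real test of buildMerkleTree would make.

```typescript
import { createHash } from "crypto";

const CHUNK_SIZE = 4; // tiny chunk size, for illustration only

// Stand-in merkle root: sha256 over chunks, then pairwise up to a root.
function toyMerkleRoot(data: Uint8Array): string {
  if (data.length === 0) {
    throw new Error("Cannot build merkle tree from empty data");
  }
  let level: Buffer[] = [];
  for (let i = 0; i < data.length; i += CHUNK_SIZE) {
    level.push(
      createHash("sha256").update(data.subarray(i, i + CHUNK_SIZE)).digest(),
    );
  }
  while (level.length > 1) {
    const next: Buffer[] = [];
    for (let i = 0; i < level.length; i += 2) {
      const h = createHash("sha256").update(level[i]);
      if (level[i + 1]) {
        h.update(level[i + 1]); // odd tail nodes are promoted as-is
      }
      next.push(h.digest());
    }
    level = next;
  }
  return level[0].toString("hex");
}
```

A real test would additionally pin a known input to a known root hash, so protocol or encoding regressions fail loudly.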

