Skip to content

refactor(email-core,provider-microsoft,provider-gmail,email-mcp): harden download_attachment (#68)#69

Merged
stevenobiajulu merged 1 commit intomainfrom
68-harden-download-attachment-20260501
May 2, 2026
Merged

refactor(email-core,provider-microsoft,provider-gmail,email-mcp): harden download_attachment (#68)#69
stevenobiajulu merged 1 commit intomainfrom
68-harden-download-attachment-20260501

Conversation

@stevenobiajulu
Copy link
Copy Markdown
Member

Summary

Hardening pass on the download_attachment MCP tool added in #67. Addresses all 5 items raised by Gemini + Codex peer review (issue #68).

# Item Fix
1 MCP transport: base64 stuffed inside JSON-text envelope download_attachment returns [text, resource] content array — bytes flow as a typed resource.blob with URI attachment://<mailbox>/<messageId>/<attachmentId>
2 Missing URL encoding on Graph IDs New encodeGraphPathId helper applied at every path-segment ID interpolation (15+ sites); folder names too
3 Base64 silent corruption Whitespace-strip + base64 character-set check + decoded-length vs Graph-reported size mismatch check
4 Lossy filename sanitization list_attachments and download_attachment now return both filename (sanitized) and original_filename (raw)
5 downloadAttachment provider contract too narrow Reshaped to return {content, filename, mimeType, size} — single round-trip, fresh metadata, race-deleted attachments map cleanly to ATTACHMENT_NOT_FOUND via new AttachmentNotFoundError sentinel (Graph 404 + Gmail 404 both mapped)

Gmail provider also gets filename-fallback parity with collectPayloadContent (filename → Content-ID → attachment-<path>) so CID-only inline parts don't regress to empty names after the contract reshape.

Test plan

  • npm run build — clean across all 4 packages
  • npm run lint --workspaces --if-present — clean
  • npm run test:run610 passing (was 580 before; +30 added)
    • email-core: 248
    • email-mcp: 163
    • provider-gmail: 68
    • provider-microsoft: 131
  • npm run check:spec-coverage — 186/186
  • npm run check:server-json, check:gemini-extension-manifest — pass
  • Manual end-to-end against a live mailbox once published — verify the resource content reaches an MCP client correctly, encoded message IDs work against real Outlook mailboxes

New tests

  • Microsoft provider: URL-encoding for downloadAttachment, listAttachments, getMessage (cross-method verification with /, +, =); malformed-base64 → GraphApiError; size-mismatch → GraphApiError; 404 → AttachmentNotFoundError; happy-path uses new return shape.
  • Gmail provider: 404 on attachment endpoint → AttachmentNotFoundError; 404 on message endpoint (for part:* synthetic ids) → AttachmentNotFoundError; non-404 errors propagate unchanged; CID-only inline part filename fallback; bare-attachment attachment-<path> fallback.
  • Action layer: non-ASCII filename round-trip via original_filename; explicit AttachmentNotFoundError mapping; updated existing tests for new contract shape.
  • MCP server: download_attachment emits text + resource content items on success; failure path stays single text envelope.

Reviewed by

Gemini and Codex via /peer-review CLI, both pre-implementation (scope framing) and post-implementation. Critical fixes from the post-implementation review:

  • Gemini caught the base64 regex rejecting MIME-style line-broken contentBytes (whitespace now stripped before validation)
  • Codex caught that Gmail 404s were only mapped on the "missing data" branch (full getMessage + getAttachment 404 mapping added)
  • Codex caught the Gmail filename-fallback regression for CID-only inline parts (parity with collectPayloadContent restored)

Both reviewers endorsed keeping: AttachmentNotSupportedError sentinel, OData microsoft.graph.fileAttachment/contentBytes cast, runtime NOT_SUPPORTED for providers without the capability, and the handleToolCall string special-case for download_attachment (acceptable for v1.1; refactor to a generic content-formatter hook deferred).

Closes #68. Ref #67.

…den download_attachment

Addresses all 5 items from peer review of #67 (issue #68):

1. MCP transport: download_attachment now returns embedded `resource` content
   (typed binary blob) alongside text metadata, instead of stuffing base64
   inside the JSON-stringified text envelope. URI scheme:
   attachment://<mailbox>/<messageId>/<attachmentId>.

2. URL encoding: added `encodeGraphPathId` helper in the Microsoft provider
   and applied it at every path-segment ID interpolation (15+ sites). Real
   Graph IDs containing `/`, `+`, `=` no longer 400/404 the request.

3. Base64 validation: Microsoft provider now strips whitespace, validates the
   base64 character set, and compares decoded length against Graph's reported
   `size`. Mismatches throw rather than silently returning truncated bytes.

4. original_filename: list_attachments and download_attachment outputs now
   carry both `filename` (sanitized) and `original_filename` (raw). Preserves
   non-ASCII names like `Räsumé.pdf` for international users.

5. Provider contract reshape: EmailAttachmentHandler.downloadAttachment now
   returns `{content, filename, mimeType, size}` instead of just `Buffer`.
   Single round-trip; fresh metadata on every call. Race-deleted attachments
   surface as ATTACHMENT_NOT_FOUND via new `AttachmentNotFoundError` sentinel
   (mapped from Graph 404 and Gmail 404).

Gmail provider also gets filename-fallback parity with collectPayloadContent
(filename → Content-ID → attachment-<path>) so CID-only inline parts don't
regress to empty names.

Tested: 610 passing (was 580). Added URL-encoding tests for downloadAttachment,
listAttachments, getMessage; malformed-base64 + size-mismatch tests; Gmail
404 mapping (message + attachment endpoints); CID and bare-attachment filename
fallbacks; non-ASCII filename round-trip; new MCP resource-content shape.

Pre-PR checks pass: build, lint, test:run, check:spec-coverage (186/186),
check:server-json, check:gemini-extension-manifest.

Reviewed by Gemini and Codex via /peer-review. Both pre-implementation
(framing scope) and post-implementation (catching the base64-whitespace bug
and the Gmail 404 / filename-fallback gaps).

Closes: #68
Ref: #67
@stevenobiajulu stevenobiajulu enabled auto-merge (squash) May 2, 2026 03:55
@stevenobiajulu stevenobiajulu merged commit 3cf618d into main May 2, 2026
13 checks passed
@codecov
Copy link
Copy Markdown

codecov Bot commented May 2, 2026

Codecov Report

❌ Patch coverage is 92.98246% with 8 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
...ackages/provider-gmail/src/email-gmail-provider.ts 87.27% 6 Missing and 1 partial ⚠️
...ges/provider-microsoft/src/email-graph-provider.ts 97.14% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Harden download_attachment: MCP transport, base64 validation, filename + provider contract (peer-review follow-up to #67)

1 participant