feat: Whole Foods Market data connector #23

Open
tnunamak wants to merge 11 commits into main from feat/wholefoods-connector

tnunamak (Member) commented Mar 5, 2026

Summary

  • Adds full Whole Foods Market connector (wholefoods/wholefoods-playwright.js) — Playwright-based extraction of order history and product nutrition data via Amazon
  • USDA FDC fallback for nutrition lookup with scoreMatch() validation and Foundation data type fallback
  • Ghost item filtering for Amazon sidebar recommendations (almBrandId detection)
  • Atwater calorie derivation (4-4-9) when store nutrition is incomplete
  • Official Whole Foods icon SVG (icons/wholefoods.svg)
  • Three schemas: wholefoods.profile, wholefoods.orders, wholefoods.nutrition
  • Schemas registered with dev data gateway and validated against real connector output
  • Registry entry with SHA-256 checksums
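
For readers unfamiliar with the 4-4-9 rule mentioned above, here is a minimal sketch of an Atwater derivation. It assumes the general Atwater factors (4 kcal/g for protein and carbohydrate, 9 kcal/g for fat). The function name `deriveCalories` and the field names `calories`, `proteinG`, `carbsG`, and `fatG` are hypothetical and may not match what the connector actually uses.

```javascript
// Hypothetical helper, not the connector's actual code.
// Derives calories from macronutrients with the general Atwater factors:
// 4 kcal/g protein, 4 kcal/g carbohydrate, 9 kcal/g fat.
function deriveCalories(nutrition) {
  const { calories, proteinG, carbsG, fatG } = nutrition;

  // Prefer the store-provided value whenever it is present.
  if (Number.isFinite(calories)) return calories;

  // Without all three macronutrients the estimate would be misleading.
  if (![proteinG, carbsG, fatG].every(Number.isFinite)) return null;

  return Math.round(4 * proteinG + 4 * carbsG + 9 * fatG);
}
```

Returning `null` for incomplete input, rather than a partial sum, keeps a derived value from being mistaken for a real label value.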

Test plan

  • Run connector locally against an Amazon/Whole Foods account
  • Validate all 3 schemas against exported data (jsonschema.validate — all pass)
  • Register schemas with dev data gateway (dev.data-gateway.vana.org)
  • Verify icons/wholefoods.svg renders correctly
  • Verify registry checksums match file hashes
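
The checksum step in the test plan could be scripted roughly as follows. This is a sketch using Node's built-in `crypto` module. It assumes the registry stores a plain lowercase hex SHA-256 digest; the real registry format is not shown in this PR, and the helper names `sha256Hex` and `checksumMatches` are hypothetical.

```javascript
const nodeCrypto = require('crypto');
const fs = require('fs');

// Hex-encoded SHA-256 digest of a string or Buffer.
function sha256Hex(data) {
  return nodeCrypto.createHash('sha256').update(data).digest('hex');
}

// Hypothetical check: compares a file's digest with the registry value.
// Assumes the registry value is a bare hex string (an assumption).
function checksumMatches(filePath, expectedHex) {
  const actual = sha256Hex(fs.readFileSync(filePath));
  return actual === expectedHex.trim().toLowerCase();
}
```

For example, `checksumMatches('wholefoods/wholefoods-playwright.js', registryValue)` would return `false` if the file changed after the registry entry was written, where `registryValue` stands in for whatever the registry entry holds.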

🤖 Generated with Claude Code

tnunamak and others added 11 commits March 5, 2026 01:30
Adds full Whole Foods connector (wholefoods/wholefoods-playwright.js) —
Playwright-based extraction of order history and product nutrition data
via Amazon.

- USDA FDC fallback for nutrition lookup with scoreMatch() validation
- Ghost item filtering (Amazon sidebar recommendations)
- Atwater calorie derivation when store nutrition is incomplete
- Official icon SVG (icons/wholefoods.svg)
- Three schemas: wholefoods.profile, wholefoods.orders, wholefoods.nutrition
  (registered with dev data gateway)
- Registry entry with checksums

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace text approximation with official WF wordmark SVG
(tree motif, WHOLE FOODS lettering, MARKET banner).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace hand-drawn SVG with actual App Store icon — reads clearly
at 48x48 and matches the HEB icon approach (PNG from Apple CDN).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
WF order detail pages (/uff/) don't render product thumbnails like
regular Amazon orders, resulting in 99% empty imageUrls. Since the
connector already visits each product's /dp/{ASIN} page for nutrition
scraping, now also grabs the hero image and backfills it into order
items and the nutrition output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Validated against 5 real Amazon product pages — #landingImage matches
on all of them. Now parses data-a-dynamic-image JSON to pick the
largest resolution instead of using the potentially downscaled src.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
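
A rough sketch of the largest-resolution selection described in the commit above. It assumes `data-a-dynamic-image` holds a JSON object mapping each image URL to a `[width, height]` pair, and it falls back to the element's `src` when the attribute is missing or malformed. The function name `pickLargestImage` is hypothetical.

```javascript
// Hypothetical helper, not the connector's actual code.
// dynamicImageJson: raw value of the data-a-dynamic-image attribute,
// assumed to look like {"https://.../a.jpg":[500,500], ...}.
function pickLargestImage(dynamicImageJson, fallbackSrc = null) {
  let sizes;
  try {
    sizes = JSON.parse(dynamicImageJson);
  } catch {
    return fallbackSrc; // attribute missing or not valid JSON
  }
  if (sizes === null || typeof sizes !== 'object') return fallbackSrc;

  let best = fallbackSrc;
  let bestArea = -1;
  for (const [url, dims] of Object.entries(sizes)) {
    if (!Array.isArray(dims) || dims.length < 2) continue;
    const area = dims[0] * dims[1];
    if (area > bestArea) {
      bestArea = area;
      best = url;
    }
  }
  return best;
}
```

In a Playwright context, the attribute value could be read with something like `page.getAttribute('#landingImage', 'data-a-dynamic-image')` and passed in together with the element's `src`.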
Skip Amazon sidebar recommendations (dresses, books, etc.) during
order detail scraping by checking for almBrandId in product URLs.
Validated: 683/683 real WF items have almBrandId, 254/254 ghost items
don't. Zero false positives.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
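
The `almBrandId` check above could look roughly like this. The sketch assumes `almBrandId` appears as a URL query parameter, which the commit message implies but does not spell out. The helper name `hasAlmBrandId`, the base URL default, and the example paths are hypothetical.

```javascript
// Hypothetical helper, not the connector's actual code.
// Returns true when the product URL carries an almBrandId query parameter,
// which the commit above treats as the marker of a real Whole Foods item.
function hasAlmBrandId(productUrl, base = 'https://www.amazon.com') {
  try {
    // The base lets relative hrefs such as "/dp/...?almBrandId=..." parse.
    return new URL(productUrl, base).searchParams.has('almBrandId');
  } catch {
    return false; // unparseable URL: treat as a ghost item
  }
}
```

With this, `hasAlmBrandId('/dp/EXAMPLEASIN?almBrandId=abc')` is `true` and `hasAlmBrandId('/dp/EXAMPLEASIN?ref=sidebar')` is `false`, so sidebar recommendations without the parameter would be skipped.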
The filter was validated against previously-scraped URL data but broke
live scraping — UFF order detail pages may render links differently
than the stored productUrl values. Ghost item filtering remains in
the UI layer (isGhostItem) where it already works.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tested against saved UFF order detail page: 32 WF items (all have
almBrandId), 1 ghost item (Amazon sidebar recommendation). Previous
empty scrape was Amazon's server-side bug, not the filter.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Save scrapedImageUrl before WF/USDA fallbacks can overwrite nutritionData
- Return data in empty-orders success path for consistency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Line-by-line parsing missed nutrients when Amazon renders them as a
single blob (e.g. "10%Total Fat8g25%Saturated Fat5g..."). Switch to
global regex scan, add #nic-nutrition-facts selector, normalize sugar
keys, and add rawNutritionText debug field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
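
To illustrate why a global scan handles the single-blob case, here is a simplified sketch. The nutrient list, the output key names, and the function name `scanNutrients` are assumptions for illustration; the connector's actual pattern and key set are not shown in this PR.

```javascript
// Hypothetical label-to-key map; sugar variants share one normalized key.
const KEY_MAP = {
  'total carbohydrate': 'totalCarbohydrate',
  'saturated fat': 'saturatedFat',
  'total sugars': 'sugars',
  'total fat': 'totalFat',
  'protein': 'protein',
  'sodium': 'sodium',
  'sugars': 'sugars',
};

// Scans the whole text for "<name><amount><unit>" runs, so it works even
// when no line breaks separate the nutrients.
function scanNutrients(text) {
  // Longest names first so "Total Sugars" is tried before "Sugars".
  const names = Object.keys(KEY_MAP)
    .sort((a, b) => b.length - a.length)
    .join('|');
  const pattern = new RegExp(
    `(${names})\\s*(\\d+(?:\\.\\d+)?)\\s*(mcg|mg|g)`,
    'gi'
  );

  const found = {};
  for (const match of text.matchAll(pattern)) {
    const key = KEY_MAP[match[1].toLowerCase()];
    // Keep the first occurrence of each nutrient.
    if (!(key in found)) {
      found[key] = { amount: Number(match[2]), unit: match[3].toLowerCase() };
    }
  }
  return found;
}
```

On the blob from the commit message, `scanNutrients('10%Total Fat8g25%Saturated Fat5g')` picks up `totalFat` as 8 g and `saturatedFat` as 5 g, because the percent-daily-value prefixes never match a nutrient name.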