
feat(import): add design.md import reverse command #40

Open
zagi wants to merge 4 commits into google-labs-code:main from zagi:feat/reverse_import_with_auto_detection


zagi commented Apr 23, 2026

Generates a DESIGN.md from an existing Node.js project by statically analyzing its design sources. No AI, no network — deterministic code analysis that runs in ~5ms on a clean project.

Pipeline

detect framework → scan sources → parse → merge → emit

What it reads

  • package.json / README.md — project name, description, version, and first-paragraph intro. README H1 beats package.json.name; directory basename is the final fallback. Dependencies are also scanned against an ordered table of 18 known icon-library packages (Lucide, Heroicons, Material Symbols, Phosphor, Tabler, Feather, Radix, Font Awesome) to infer icons.library.
  • Tailwind configs (tailwind.config.{js,ts,cjs,mjs}) — loaded via dynamic import (Bun handles TS natively); theme.extend is walked for colors, borderRadius, spacing, fontSize (incl. the [size, meta] tuple form), and fontFamily. Regex fallback on eval errors so malformed configs still surface their color block.
  • CSS custom properties — both Tailwind v4 @theme { } blocks (with prefix-stripping: --color-primary → colors.primary, --spacing-md, --radius-lg, --font-*, --text-*, --leading-*, --tracking-*, --font-weight-*; --breakpoint-* skipped) and legacy :root { } blocks (name-heuristic classification). --icon-* properties (--icon-library, --icon-style, --icon-stroke-width, --icon-grid, --icon-color, --icon-size, --icon-size-<bucket>) are routed into the icons block instead of being misclassified as generic spacing/color tokens.
  • DTCG tokens (tokens.json, design-tokens.json, design_tokens.json, *.tokens.json) — walks $type/$value; only accepts tokens under colors / spacing / rounded / typography top-level sections so per-component dimensions don't pollute the scale. A top-level icons block is also recognised, accepting both $value-wrapped and bare values for library / style / strokeWidth / grid / size / color.
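
The prefix-stripping described for `@theme` blocks can be pictured roughly as follows. This is a minimal sketch, not the PR's parser: `classifyThemeVar`, `Classified`, and the prefix table are hypothetical names, only a few of the listed prefixes are covered, and mapping `--radius-*` to `rounded` is an assumption based on the frontmatter sections named under Output.

```typescript
// Hypothetical sketch of @theme prefix-stripping; all names are illustrative.
type Section = "colors" | "spacing" | "rounded" | "icons";

interface Classified {
  section: Section;
  key: string;
}

// Each custom-property prefix maps to one target section (assumed table).
const PREFIXES: ReadonlyArray<[string, Section]> = [
  ["--color-", "colors"],
  ["--spacing-", "spacing"],
  ["--radius-", "rounded"],
  ["--icon-", "icons"],
];

function classifyThemeVar(name: string): Classified | null {
  // Breakpoints are skipped rather than imported as tokens.
  if (name.startsWith("--breakpoint-")) return null;
  for (const [prefix, section] of PREFIXES) {
    if (name.startsWith(prefix)) {
      const key = name.slice(prefix.length);
      // A bare prefix with nothing after it carries no usable key.
      return key.length > 0 ? { section, key } : null;
    }
  }
  return null; // unknown prefix: left for other heuristics
}
```

Under these assumptions, `classifyThemeVar("--color-primary")` yields the `colors` section with key `primary`, and `--icon-stroke-width` is routed to `icons` instead of falling through to a generic bucket.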

Framework detection

Cosmetic, reported in the UI. Recognizes Next, Nuxt, Vite, SvelteKit, Remix, Astro, Create React App, Gatsby, Angular, Vue CLI, and falls back to generic Node / unknown. Meta-frameworks beat Vite on conflicts.
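
One way to get "meta-frameworks beat Vite" is an ordered rule table in which Vite is checked last. The sketch below is an assumption about the approach, not the PR's detector: `detectFramework` and `RULES` are hypothetical names, and the dependency names used as signals are the packages these frameworks are commonly installed under, which may differ from what the PR actually checks.

```typescript
// Hypothetical sketch: first matching rule wins, so order encodes precedence.
type Framework =
  | "Next" | "Nuxt" | "SvelteKit" | "Remix" | "Astro"
  | "Create React App" | "Gatsby" | "Angular" | "Vue CLI" | "Vite" | "Node";

const RULES: ReadonlyArray<[string, Framework]> = [
  ["next", "Next"],
  ["nuxt", "Nuxt"],
  ["@sveltejs/kit", "SvelteKit"],
  ["@remix-run/react", "Remix"],
  ["astro", "Astro"],
  ["react-scripts", "Create React App"],
  ["gatsby", "Gatsby"],
  ["@angular/core", "Angular"],
  ["@vue/cli-service", "Vue CLI"],
  ["vite", "Vite"], // last, so any meta-framework above takes priority
];

function detectFramework(deps: Record<string, string>): Framework {
  for (const [pkg, name] of RULES) {
    // Own-property check also works on null-prototype maps.
    if (Object.prototype.hasOwnProperty.call(deps, pkg)) return name;
  }
  return "Node"; // generic fallback
}
```

A SvelteKit project that also lists `vite` resolves to `SvelteKit` here because its rule comes first.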

Scan hygiene

Bounded at depth 5. Skips node_modules, .git, .next, .nuxt, .output, .svelte-kit, .turbo, build, coverage, dist. Also skips vendor trees (public, static, vendor, vendors, third-party, third_party, bundles, charting_library), minified/RTL stylesheets (*.min.css, *.rtl.css), and hashed bundler output (<name>.<hash>.css), so that, for example, a bundled TradingView charting library's 40+ v-rhythm-* tokens don't leak into the project's own design system.
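
The skip rules above amount to a predicate over a project-relative path. The following is a sketch under stated assumptions: `shouldScan` is a hypothetical name, depth is taken to mean the number of directory segments, paths use `/` separators, and the hash pattern (eight or more hex characters between two dots) is a guess at what "hashed bundler output" looks like.

```typescript
// Hypothetical sketch of the scan filter; the directory names come from the list above.
const SKIP_DIRS = new Set([
  "node_modules", ".git", ".next", ".nuxt", ".output", ".svelte-kit", ".turbo",
  "build", "coverage", "dist",
  "public", "static", "vendor", "vendors", "third-party", "third_party",
  "bundles", "charting_library",
]);
const MAX_DEPTH = 5;

// Assumed hash shape, e.g. app.3f9a1c2b.css.
const HASHED_CSS = /\.[0-9a-f]{8,}\.css$/i;

function shouldScan(relPath: string): boolean {
  const parts = relPath.split("/");
  const file = parts[parts.length - 1] ?? "";
  const dirs = parts.slice(0, -1);
  if (dirs.length > MAX_DEPTH) return false;               // too deep
  if (dirs.some((d) => SKIP_DIRS.has(d))) return false;    // build or vendor tree
  if (file.endsWith(".min.css") || file.endsWith(".rtl.css")) return false;
  return !HASHED_CSS.test(file);                           // hashed bundler output
}
```

With this filter, `src/styles/tokens.css` is scanned while anything under `public/charting_library/` is ignored.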

Merge

Precedence: sources are applied in the order CSS → Tailwind → DTCG, and later sources win, so DTCG (the most structured format) has the final say. Rebuilds the flat symbolTable the linter expects so the generated state can be round-tripped through lint/export.

For icons specifically, the package.json library heuristic is unshifted to lowest precedence — CSS and DTCG declarations override the dependency-scan guess, since a project may pull lucide-react for one component while declaring "Heroicons" in its design tokens. Within the merge, scalar fields use last-wins and size maps merge element-wise, capped at 256 entries to bound the emitted YAML.
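
The icon merge rules (scalars last-wins, size maps merged element-wise, capped at 256 entries) might look like the sketch below. It is illustrative only: the fields given to `IconsData` here, the `mergeIcons` name, and the choice to drop new buckets once the cap is reached are all assumptions; the PR may handle overflow differently.

```typescript
// Hypothetical shape; the real IconsData type may carry more fields.
interface IconsData {
  library?: string;
  style?: string;
  strokeWidth?: number;
  size?: Record<string, string>;
}

const MAX_ICON_SIZE_ENTRIES = 256;

// Sources are passed lowest precedence first; later entries win.
function mergeIcons(sources: ReadonlyArray<IconsData>): IconsData {
  const out: IconsData = {};
  // Null-prototype map so odd bucket names are plain keys.
  const size: Record<string, string> = Object.create(null);
  let count = 0;
  for (const src of sources) {
    if (src.library !== undefined) out.library = src.library;
    if (src.style !== undefined) out.style = src.style;
    if (src.strokeWidth !== undefined) out.strokeWidth = src.strokeWidth;
    for (const [bucket, value] of Object.entries(src.size ?? {})) {
      if (!(bucket in size)) {
        if (count >= MAX_ICON_SIZE_ENTRIES) continue; // cap reached: skip new buckets
        count++;
      }
      size[bucket] = value; // existing buckets are overwritten by later sources
    }
  }
  if (count > 0) out.size = size;
  return out;
}
```

Calling it with the package.json guess first, then CSS, then DTCG reproduces the described behaviour: a `lucide-react` dependency suggests "Lucide", but a later DTCG declaration of "Heroicons" overrides it.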

Output

YAML frontmatter (name, description, colors, typography, rounded, spacing, and icons when discovered) plus a markdown body: # Name heading, description, README intro, ## Overview (framework + counts + source summary), per-section bullet lists of the imported tokens — including a ## Iconography section after ## Rounded when icon metadata is present — and a footer inviting the team to edit the prose. Every user-controlled string in the icons block passes through the same sanitizeImportedText pipeline as name and description, neutralising heading injection, HTML, and CR/LF before emission. The frontmatter alone round-trips cleanly through lint and back through export.
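
As a rough picture of what neutralising heading injection, HTML, and CR/LF can involve, consider the sketch below. It is not the PR's `sanitizeImportedText`: the function name is suffixed with `Sketch` to mark it as hypothetical, and the exact escaping choices (HTML entities, a backslash before a leading `#`) are assumptions.

```typescript
// Hypothetical sanitizer sketch for text imported from untrusted project files.
function sanitizeImportedTextSketch(input: string): string {
  return input
    .replace(/[\r\n]+/g, " ")   // collapse CR/LF so no new markdown line can start
    .replace(/&/g, "&amp;")     // escape HTML-significant characters
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .trim()
    .replace(/^#/, "\\#");      // a leading '#' would otherwise open a heading
}
```

Once newlines are collapsed, a `##` in the middle of the string can no longer begin a heading, so only the leading character needs escaping.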

Forward compatibility with #44

The icons: frontmatter shape and ## Iconography section match the schema proposed in #44 (feat(spec): add Iconography section and icons.* tokens) field-for-field. Until #44 lands, the main-branch linter silently passes both as unknown content (parser whitelist + section-order rule), so this PR is independently mergeable. Once #44 lands, the same emitted output is actively validated without any code change here. Two follow-ups are planned post-#44: populating icons.size.* / icons.grid / icons.color into the symbolTable (requires the widened ResolvedValue union from #44), and lifting Tailwind/DTCG export from no-op to pass-through if either format ever grows an icon surface.

CLI

design.md import <project>              # writes <project>/DESIGN.md
design.md import <project> --dryRun     # prints to stdout
design.md import <project> --format json # NDJSON progress events

Pretty mode renders live via Ink, showing staged progress (◐/✓/⚠/✗) for detect → scan → parse → merge → write. JSON mode emits one ImportStep per line on stdout for scripts and CI.
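
A script consuming JSON mode only needs to split stdout on newlines and parse each line. The sketch below deliberately assumes nothing about the fields of `ImportStep`, which this description does not spell out; `parseNdjson` is a hypothetical helper name.

```typescript
// Hypothetical NDJSON reader: one JSON object per non-empty line.
function parseNdjson(stdout: string): Array<Record<string, unknown>> {
  const events: Array<Record<string, unknown>> = [];
  for (const line of stdout.split(/\r?\n/)) {
    if (line.trim() === "") continue; // tolerate blank lines and a trailing newline
    const value: unknown = JSON.parse(line);
    // Keep only plain objects; anything else is ignored.
    if (typeof value === "object" && value !== null && !Array.isArray(value)) {
      events.push(value as Record<string, unknown>);
    }
  }
  return events;
}
```

The input would be the captured stdout of `design.md import <project> --format json`; which keys to inspect on each event depends on the actual `ImportStep` shape.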

Tests

361 passing. Unit tests cover every parser, the framework detector, the source scanner's vendor filtering, the merger, the markdown emitter, project metadata, and the Ink component, including ~38 dedicated icon-discovery cases (package.json library detection, --icon-* classification, DTCG icons block parsing, merger size-map cap, frontmatter and body emission, heading-injection sanitisation, edge cases for strokeWidth: 0 / NaN / unit-suffix / empty buckets). Integration tests (VR-1, VR-2) round-trip examples/paws-and-paths, atmospheric-glass, totality-festival, and four framework fixtures (Next, Vite, Nuxt, plus the new icon-project fixture exercising package.json + CSS + DTCG icon discovery in one run) through import → lint and assert zero linter errors.

Security

All JSON paths route through safeJsonParse (prototype-pollution guard from commit 47dd3dd); the DTCG icons-subtree parser additionally re-skips __proto__ / constructor / prototype keys in the size map as defense-in-depth. The DTCG walker short-circuits the icons subtree at the top level so a deep icons tree can't burn DFS cycles before being filtered. MAX_ICON_SIZE_ENTRIES = 256 caps the merged size map so an attacker-controlled tokens.json can't bloat the emitted YAML. yamlStringify(doc, { indent: 2 }) is pinned so quoting tests stay stable across yaml-package upgrades.
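
A prototype-pollution guard of this kind can be built on the standard `JSON.parse` reviver, which removes a property when the reviver returns `undefined`. The following is a sketch of the idea, not the `safeJsonParse` from commit 47dd3dd; the `Sketch` suffix marks the name as hypothetical.

```typescript
// Keys that should never survive parsing of untrusted JSON.
const FORBIDDEN_KEYS = new Set(["__proto__", "constructor", "prototype"]);

// Hypothetical guard: drop forbidden keys at every nesting level.
function safeJsonParseSketch(text: string): unknown {
  return JSON.parse(text, (key, value) =>
    // Returning undefined from a reviver deletes the property.
    FORBIDDEN_KEYS.has(key) ? undefined : value,
  );
}
```

Because the reviver runs for every key in the tree, a `__proto__` entry nested deep inside a tokens file is stripped just like one at the top level.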

Build

Marks ink, react, and react-devtools-core as --external so Ink 7's devtools import doesn't break the bundler.

  Generates a DESIGN.md from an existing Node.js project by statically
  analyzing its design sources. No AI, no network — deterministic code
  analysis that runs in ~5ms on a clean project.

  ## Pipeline

      detect framework → scan sources → parse → merge → emit

  ## What it reads

  - **package.json / README.md** — project name, description, version,
    and first-paragraph intro. README H1 beats package.json.name;
    directory basename is the final fallback.
  - **Tailwind configs** (`tailwind.config.{js,ts,cjs,mjs}`) — loaded
    via dynamic import (Bun handles TS natively); `theme.extend` is
    walked for colors, borderRadius, spacing, fontSize (incl. the
    `[size, meta]` tuple form), and fontFamily. Regex fallback on
    eval errors so malformed configs still surface their color block.
  - **CSS custom properties** — both Tailwind v4 `@theme { }` blocks
    (with prefix-stripping: `--color-primary` → `colors.primary`,
    `--spacing-md`, `--radius-lg`, `--font-*`, `--text-*`, `--leading-*`,
    `--tracking-*`, `--font-weight-*`; `--breakpoint-*` skipped) and
    legacy `:root { }` blocks (name-heuristic classification).
  - **DTCG tokens** (`tokens.json`, `design-tokens.json`,
    `design_tokens.json`, `*.tokens.json`) — walks `$type`/`$value`;
    only accepts tokens under `colors` / `spacing` / `rounded` /
    `typography` top-level sections so per-component dimensions don't
    pollute the scale.

  ## Framework detection

  Cosmetic, reported in the UI. Recognizes Next, Nuxt, Vite, SvelteKit,
  Remix, Astro, Create React App, Gatsby, Angular, Vue CLI, and falls
  back to generic Node / unknown. Meta-frameworks beat Vite on conflicts.

  ## Scan hygiene

  Bounded at depth 5. Skips node_modules, .git, .next, .nuxt, .output,
  .svelte-kit, .turbo, build, coverage, dist. Also skips vendor trees
  (public, static, vendor, vendors, third-party, third_party, bundles,
  charting_library), minified/RTL stylesheets (*.min.css, *.rtl.css),
  and hashed bundler output (<name>.<hash>.css) — so e.g. a bundled
  TradingView charting library's 40+ v-rhythm-* tokens don't leak into
  the project's own design system.

  ## Merge

  Precedence: CSS → Tailwind → DTCG (later wins because DTCG is most
  structured). Rebuilds the flat symbolTable the linter expects so the
  generated state can be round-tripped through lint/export.

  ## Output

  YAML frontmatter (name, description, colors, typography, rounded,
  spacing) plus a markdown body: `# Name` heading, description, README
  intro, `## Overview` (framework + counts + source summary),
  per-section bullet lists of the imported tokens, and a footer
  inviting the team to edit the prose. The frontmatter alone
  round-trips cleanly through `lint` and back through `export`.

  ## CLI

      design.md import <project>              # writes <project>/DESIGN.md
      design.md import <project> --dryRun     # prints to stdout
      design.md import <project> --format json # NDJSON progress events

  Pretty mode renders live via Ink, showing staged progress (◐/✓/⚠/✗)
  for detect → scan → parse → merge → write. JSON mode emits one
  ImportStep per line on stdout for scripts and CI.

  ## Tests

  275 passing. Unit tests cover every parser, the framework detector,
  the source scanner's vendor filtering, the merger, the markdown
  emitter, project metadata, and the Ink component. Integration tests
  (VR-1, VR-2) round-trip examples/paws-and-paths, atmospheric-glass,
  totality-festival, and three framework fixtures (Next, Vite, Nuxt)
  through import → lint and assert zero linter errors.

  ## Build

  Marks ink, react, and react-devtools-core as --external so Ink 7's
  devtools import doesn't break the bundler.

google-cla Bot commented Apr 23, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

  Threat model: user runs `design.md import` on a repo they don't fully
  trust. Every file under the project root is attacker-controlled.

  - safe-eval: replace dynamic import() with vm.runInNewContext + inert-
    clone of exports (strips getters/functions/proxies) to block the
    Error.prepareStackTrace realm escape. ReDoS-proof TS type stripping
    (linear-time negated class; old `as` regex was O(n²), 28s on 50k
    stacked casts).
  - safe-write: O_NOFOLLOW + lstat + realpath containment so a planted
    `DESIGN.md -> ~/.zshrc` symlink cannot redirect the write.
  - source-scanner: lstatSync + skip symlinks so an `evil.tokens.json
    -> /etc/passwd` symlink cannot exfiltrate host files into DESIGN.md.
  - safe-json: JSON.parse reviver drops __proto__/constructor/prototype;
    framework-detector builds a null-prototype deps map.
  - markdown-emitter: sanitize description/README intro (collapse
    newlines, escape HTML and leading `#`) and wrap README intro in a
    blockquote so downstream LLM consumers attribute it to the repo.
  - error-sanitize: stderr defaults to {code}-only; --verbose opts into
    a path-redacted message. Redaction handles unicode, spaces, and
    URLs without over-matching.
  - runImport canonicalizes projectPath via realpathSync once and uses
    the same root for scan + write containment (no TOCTOU split).

  Red-team verified: getter RCE, symlink overwrite, symlink scan escape,
  and __proto__ pollution all blocked in one malicious repo. 319 tests.
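
Once both the project root and the write target have been canonicalized (for example via realpath), the containment check itself reduces to a prefix comparison with a separator guard. The sketch below assumes POSIX-style paths that are already resolved; `isContained` is a hypothetical name and does not perform the lstat or O_NOFOLLOW steps listed above.

```typescript
// Hypothetical containment check on already-canonicalized POSIX paths.
function isContained(realRoot: string, realTarget: string): boolean {
  // Guard with a trailing separator so "/repo-evil" is not treated as inside "/repo".
  const rootWithSep = realRoot.endsWith("/") ? realRoot : realRoot + "/";
  return realTarget === realRoot || realTarget.startsWith(rootWithSep);
}
```

A symlinked `DESIGN.md` that resolves to a path under the home directory fails this check because its canonical path does not start with the root plus a separator.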
zagi added 2 commits April 25, 2026 00:43
Teaches the import pipeline (detect → scan → parse → merge → emit) to
actively discover icon-token metadata from package.json, CSS custom
properties, and DTCG token files, instead of leaving icons as a no-op.

The emitted YAML frontmatter and `## Iconography` body section follow
the shape the parallel iconography-spec PR (google-labs-code#44) will validate, so once
that PR lands on main these imported files validate without changes.
Until then, the main-branch linter silently ignores both the unknown
`icons:` frontmatter key and the unknown `## Iconography` section.

Discovery sources
  • package.json — 18 known icon packages map to display library names
    (e.g. lucide-react → "Lucide", @heroicons/react → "Heroicons").
  • CSS — `--icon-library`, `--icon-style`, `--icon-stroke-width`,
    `--icon-stroke`, `--icon-grid`, `--icon-color`, `--icon-size`,
    `--icon-size-<bucket>` are routed into the icons block instead of
    being misclassified as generic spacing/color tokens.
  • DTCG — top-level `icons` block with library/style/strokeWidth/grid/
    size/color, accepting both `$value`-wrapped and bare values.

Pipeline
  • New `IconsData` type carried internally by the importer (not part
    of the linter's `DesignSystemState`, which has no icons field on
    main). New `MergedState extends DesignSystemState` + widened
    `PartialState` thread the icons data through merger → emitter.
  • Merger reconciles all three sources field-by-field (last wins) and
    merges size maps element-wise. Package.json icons is unshifted to
    LOWEST precedence so CSS/DTCG explicit declarations override the
    dependency heuristic; name/description from package.json keep their
    pre-existing HIGHEST precedence.
  • Emitter writes the `icons:` YAML block after spacing/rounded and
    the `## Iconography` body section after `## Rounded`. Every
    user-controlled string passes through `sanitizeImportedText` to
    prevent heading-injection.

Security
  • All JSON paths use `safeJsonParse` (prototype-pollution guard).
  • DTCG `parseIconsSubtree` re-skips `__proto__/constructor/prototype`
    in the size map as defense-in-depth.
  • `MAX_ICON_SIZE_ENTRIES = 256` caps an attacker-controlled size map
    so the emitted YAML cannot be unboundedly bloated.
  • The DTCG walker short-circuits the icons subtree to avoid wasted
    DFS over a deep icons tree the walker would discard anyway.
  • `yamlStringify(doc, { indent: 2 })` pinned so quoting tests stay
    stable across yaml-package upgrades.

Tests
  • 38 new unit tests across project-metadata, css-var-parser,
    dtcg-parser, merger, and markdown-emitter; covers each parser's
    happy path, rejection of malformed input, and the empty-state
    omission of the icons field.
  • New e2e fixture `icon-project/` with package.json + CSS + DTCG
    proves all three sources merge additively and the emitted DESIGN.md
    passes `lint` with zero errors.
  • Edge cases pinned: strokeWidth: 0 survives emission, NaN/Infinity
    rejected at parse, `--icon-size-` (empty bucket) returns null,
    `1.5px` strokeWidth (unit suffix) rejected, empty/null DTCG icons
    objects yield no icons field.
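
The strokeWidth edge cases listed above can be satisfied by a strict numeric parse. This is a minimal sketch assuming the value arrives as a string from a CSS custom property; `parseStrokeWidth` is a hypothetical name and the accepted grammar (unsigned plain decimals only) is an assumption.

```typescript
// Hypothetical strict parser: plain decimals only, no units, no NaN/Infinity.
function parseStrokeWidth(raw: string): number | null {
  const text = raw.trim();
  // Rejects "1.5px", "", "NaN", and "Infinity" before any numeric conversion.
  if (!/^\d+(\.\d+)?$/.test(text)) return null;
  const n = Number(text);
  return Number.isFinite(n) ? n : null;
}
```

For `0` to survive emission, callers would need to test the result against `null` rather than relying on truthiness, since `0` is falsy.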

Out of scope (deferred follow-ups)
  • `symbolTable` does NOT yet contain `icons.*` entries — the linter's
    `ResolvedValue` union (color | dimension | typography | string)
    cannot represent an icons block on main. After PR google-labs-code#44 lands and
    `ResolvedIcons` enters `ResolvedValue`, a follow-up will populate
    icons.size.*, icons.grid, and icons.color.
  • Tailwind/DTCG export remains a no-op for icons; neither format has
    an icon surface in v1, matching the stated behavior in PR google-labs-code#44.

Refs: google-labs-code#41, google-labs-code#44