
feat(docs-site): SEO improvements — OG image, RSS feed, sitemap lastmod, structured data#508

Merged
raymondk merged 13 commits into main from
feat/seo-improvements
Apr 17, 2026

@marc0olo
Member

@marc0olo marc0olo commented Apr 17, 2026

Summary

Closes #507.

  • OG image: og-image.png generated at build time from public/og-image.svg via @resvg/resvg-js (Inter font, dark-mode DFINITY style). The og:image meta tag always references the root URL; CI copies the image from the latest versioned build to the deployment root.
  • Structured data: JSON-LD WebSite + Organization schemas injected on every page via astro.config.mjs.
  • Meta tags: og:image, og:image:alt, twitter:image, robots (index, follow, max-image-preview:large), author added globally.
  • RSS feed: feed.xml generated by astro-agent-docs plugin with git-accurate publish dates per page; linked from the new custom Footer component and via <link rel="alternate"> in <head>.
  • Sitemap lastmod: git commit dates injected into the Starlight-generated sitemap.
  • llms-full.txt: full content dump for RAG pipeline ingestion (complementing the existing llms.txt).
  • robots.txt (CI-generated): dynamically built in publish-root-files from versions.json; allows only the latest version's path, disallows old versions and /main/ (except when no releases exist yet). Not placed in versioned build output.
  • Root sitemap.xml (CI-generated): sitemapindex pointing directly to /${LATEST_VERSION}/sitemap-0.xml (spec-compliant: a sitemapindex must reference sitemaps, not other sitemapindex files).
  • CI publish-root-files: now also copies og-image.png, llms-full.txt, and feed.xml from the latest versioned folder to the deployment root (same pattern as existing llms.txt).
  • Footer component: custom Footer.astro override adds RSS and llms.txt discovery links alongside Starlight's existing edit/pagination links.
  • Docs updated: docs-site/README.md reflects new build pipeline and root-file deployment pattern; task6-docs.md fixes a stale path reference and notes the automatic root-file copy on versions.json merge.
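
The robots.txt rule described in the summary (allow only the latest version, disallow older versions and /main/ unless no release exists yet) can be sketched as a small pure function. This is an illustration only, not the CI step itself: `buildRobotsTxt`, the plain array standing in for the contents of versions.json, and the trailing `Sitemap:` line are all assumptions.

```javascript
// Hypothetical helper, not the actual CI script.
// `versions` stands in for the release list read from versions.json (real shape unknown).
function buildRobotsTxt(versions, latest, siteUrl) {
  const lines = ['User-agent: *'];
  if (versions.length === 0) {
    // No release yet: /main/ is the only docs path, so it stays crawlable.
    lines.push('Allow: /main/');
  } else {
    lines.push(`Allow: /${latest}/`);
    for (const version of versions) {
      if (version !== latest) lines.push(`Disallow: /${version}/`);
    }
    // Once a release exists, the unreleased /main/ docs are kept out of the index.
    lines.push('Disallow: /main/');
  }
  // Assumption: the file also advertises the root sitemap.
  lines.push('', `Sitemap: ${siteUrl}/sitemap.xml`);
  return lines.join('\n') + '\n';
}

console.log(buildRobotsTxt(['0.1', '0.2'], '0.2', 'https://cli.internetcomputer.org'));
```

Keeping the "no releases" case as its own branch avoids emitting both an Allow and a Disallow rule for /main/.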

Deployment sequence after merge

  1. Merge this PR → main: SEO changes are live in /main/; CI attempts to copy og-image.png from the latest versioned folder (may be skipped if not yet generated there).
  2. Cherry-pick relevant commits onto docs/v0.2 → push: publish-versioned-docs builds /0.2/ with the new SEO output including og-image.png.
  3. workflow_dispatch on the docs workflow from main (or wait for the next main push): publish-root-files copies og-image.png, llms.txt, llms-full.txt, feed.xml from 0.2/ to the deployment root — https://cli.internetcomputer.org/og-image.png then serves the new image.

Test plan

  • Run cd docs-site && npm ci && npm run build locally — confirm dist/og-image.png, dist/llms.txt, dist/llms-full.txt, dist/feed.xml are generated
  • Run npm run test:versions — confirm root-level feed.xml, llms.txt, robots.txt, sitemap.xml are present alongside versioned folders
  • Confirm og-image.png renders correctly (dark background, ICP CLI headline)
  • Confirm feed.xml is valid RSS (check in a feed reader or validator)
  • Confirm llms-full.txt contains all page content
  • After merging and deploying docs/v0.2: verify https://cli.internetcomputer.org/og-image.png serves the new image
  • Verify og:image meta tag appears on all pages in production

🤖 Generated with Claude Code

…friendly docs

- Generate og-image.png at build time from og-image.svg via @resvg/resvg-js
- Add JSON-LD structured data (WebSite + Organization schemas) to all pages
- Add og:image, og:image:alt, twitter:image, robots, and author meta tags
- Generate RSS 2.0 feed (feed.xml) with git-accurate publish dates per page
- Generate llms-full.txt for RAG pipeline ingestion
- Inject git-accurate <lastmod> dates into the Starlight-generated sitemap
- Add dynamic robots.txt generation in CI (blocks old versioned paths, /main/)
- Add root sitemap.xml index in CI pointing to latest version's sitemap
- Copy og-image.png, llms.txt, llms-full.txt, feed.xml from latest version to root in CI
- Add custom Footer component with RSS and llms.txt discovery links
- Update docs-site/README.md and task6-docs.md to reflect new build/deploy behavior

Closes #507
@marc0olo marc0olo force-pushed the feat/seo-improvements branch from 917be90 to 064ef5f on April 17, 2026 09:52

Copilot AI left a comment


Pull request overview

This PR improves SEO and content discovery for the versioned Astro/Starlight docs site (cli.internetcomputer.org) by adding social sharing assets, structured metadata, an RSS feed, git-based sitemap lastmod, and CI-generated root-level discovery files aligned with the site’s versioned deployment model.

Changes:

  • Add global SEO metadata (OG/Twitter image, robots/author meta, JSON-LD structured data) and a custom footer with discovery links.
  • Enhance the astro-agent-docs build plugin to generate llms-full.txt, feed.xml, inject sitemap lastmod from git history, and render og-image.png from og-image.svg.
  • Update CI to generate root robots.txt + root sitemap.xml and copy llms-full.txt, feed.xml, and og-image.png from the latest versioned build to the deployment root.
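
As a rough illustration of the JSON-LD part, the two schemas could be expressed as Starlight `head` entries (objects with `tag`, `attrs`, and `content`). The helper name `jsonLdHeadEntries` and the `name` values are placeholders chosen for this sketch; the PR's actual schema fields are not shown in this thread.

```javascript
// Hypothetical helper: returns entries suitable for Starlight's `head` option.
// The schema field values below are placeholders, not the PR's real data.
function jsonLdHeadEntries(siteUrl) {
  const schemas = [
    { '@context': 'https://schema.org', '@type': 'WebSite', name: 'ICP CLI', url: siteUrl },
    { '@context': 'https://schema.org', '@type': 'Organization', name: 'DFINITY', url: siteUrl },
  ];
  // One <script type="application/ld+json"> element per schema.
  return schemas.map((schema) => ({
    tag: 'script',
    attrs: { type: 'application/ld+json' },
    content: JSON.stringify(schema),
  }));
}

console.log(jsonLdHeadEntries('https://cli.internetcomputer.org'));
```

Serializing with `JSON.stringify` keeps the embedded script content valid JSON regardless of quoting in the field values.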

Reviewed changes

Copilot reviewed 7 out of 9 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| docs-site/src/components/Footer.astro | Adds footer override with RSS + llms discovery links. |
| docs-site/public/og-image.svg | Adds source SVG for OG image generation. |
| docs-site/plugins/astro-agent-docs.mjs | Adds git-date utilities, generates llms-full.txt + RSS feed, injects sitemap lastmod, and converts SVG→PNG for OG image. |
| docs-site/package.json | Adds @resvg/resvg-js and @fontsource/inter for build-time OG PNG rendering. |
| docs-site/package-lock.json | Locks new dependencies. |
| docs-site/astro.config.mjs | Adds global head tags (RSS, robots, author, OG/Twitter, JSON-LD) and registers custom Footer component. |
| docs-site/README.md | Documents new build artifacts and root-file publishing behavior. |
| .github/workflows/docs.yml | Generates root robots.txt + sitemap.xml and copies new root discovery assets from latest version. |
| .claude/skills/release/task6-docs.md | Updates internal release instructions to reflect new docs deployment paths and root-file copy behavior. |
Files not reviewed (1)
  • docs-site/package-lock.json: Language not supported
Comments suppressed due to low confidence (1)

docs-site/astro.config.mjs:15

  • SITE provides a sensible default, but config.site is still set from process.env.PUBLIC_SITE only. When PUBLIC_SITE isn’t set (e.g., local npm run build per the PR test plan), config.site becomes undefined and the agent-docs plugin will emit RSS links like /guides/... (non-canonical/invalid for many feed readers) and may also affect sitemap URL generation. Consider setting site: SITE so local builds consistently produce absolute URLs and the plugin’s siteUrl is populated.
export default defineConfig({
  site: process.env.PUBLIC_SITE,
  // For versioned deployments: /0.1/, /0.2/, etc.
  // PUBLIC_BASE_PATH is set per-version in CI (e.g., /0.2/, /main/)
  base: process.env.PUBLIC_BASE_PATH || '/',
  markdown: {
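
A minimal sketch of the fallback this comment asks for, assuming the production URL as the default (the actual default value of `SITE` is not shown in this thread). `resolveSite` and `DEFAULT_SITE` are hypothetical names; the trailing-slash trim mirrors a fix made later in this thread.

```javascript
// Assumption: the production URL is used as the default when PUBLIC_SITE is unset.
const DEFAULT_SITE = 'https://cli.internetcomputer.org';

// Hypothetical helper: resolves the site URL once so `site` and the plugin's siteUrl agree.
function resolveSite(env) {
  const site = env.PUBLIC_SITE || DEFAULT_SITE;
  // Trim trailing slashes so `${SITE}/feed.xml` never contains a double slash.
  return site.replace(/\/+$/, '');
}

const SITE = resolveSite(process.env);
console.log(SITE);
```

The resolved value would then be passed as `site: SITE` in `defineConfig`, so local builds without `PUBLIC_SITE` still produce absolute URLs.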


Comment thread docs-site/src/components/Footer.astro
Comment thread docs-site/astro.config.mjs
Comment thread .github/workflows/docs.yml Outdated
Comment thread docs-site/plugins/astro-agent-docs.mjs
Comment thread docs-site/plugins/astro-agent-docs.mjs
Comment thread docs-site/plugins/astro-agent-docs.mjs
- Use root-absolute URLs for feed.xml and llms.txt in head links and Footer
  so feed readers and agents always discover the canonical root endpoint
- Use site: SITE (with fallback) instead of site: process.env.PUBLIC_SITE
  so siteUrl is always populated in the plugin during local builds
- Add fetch-depth: 0 to all three build job checkouts so git log returns
  accurate dates for sitemap lastmod and RSS pubDate
- Fix robots.txt contradictory Allow/Disallow /main/ when LATEST_VERSION=main
- Strip BOM from per-page .md files when concatenating llms-full.txt
- Memoize getGitDate() results to avoid redundant git log subprocesses
- Use root-absolute siteUrl for agent signaling directive href in HTML pages
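
The BOM fix in this commit can be illustrated as follows. This is a sketch under the assumption that the page contents are already read into strings; `stripBom` and `concatPages` are hypothetical names, and the blank-line separator is a guess rather than the plugin's real format.

```javascript
// U+FEFF at the start of a file is a byte order mark, not page content.
const BOM = '\uFEFF';

// Remove a single leading BOM if present; leave all other text untouched.
function stripBom(text) {
  return text.startsWith(BOM) ? text.slice(1) : text;
}

// Hypothetical concatenation step for llms-full.txt (separator is an assumption).
function concatPages(pages) {
  return pages.map(stripBom).join('\n\n');
}

console.log(JSON.stringify(concatPages(['\uFEFF# Install', '# Usage'])));
```

Without the strip, a BOM from any page after the first would end up in the middle of the combined file as a stray invisible character.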
…to feat/seo-improvements

# Conflicts:
#	.github/workflows/docs.yml
#	docs-site/astro.config.mjs
#	docs-site/plugins/astro-agent-docs.mjs
#	docs-site/src/components/Footer.astro
…oot files in test script

- Fix atom:link rel="self" in feed.xml to point to ${siteUrl}/feed.xml
  (canonical subscription URL) instead of the versioned path
- Update test-version-switcher.sh to simulate the publish-root-files CI step:
  copies llms.txt, llms-full.txt, feed.xml, og-image.png from the latest
  version folder to dist-test/ root, and generates robots.txt
@marc0olo
Member Author

Addressed all Copilot review feedback in a follow-up commit:

  • Canonical root URLs: feed.xml and llms.txt links in <head> and the footer now point to ${SITE}/feed.xml / ${SITE}/llms.txt (root-absolute) instead of versioned paths — so feed readers and agents always discover the stable canonical endpoint, not a version-pinned copy
  • site: SITE: fixed so siteUrl is always populated in the plugin even without PUBLIC_SITE set locally
  • fetch-depth: 0: added to all three build job checkouts so git log returns accurate dates for sitemap <lastmod> and RSS <pubDate>
  • robots.txt edge case: fixed contradictory Allow/Disallow /main/ when no releases exist yet
  • BOM stripping: leading BOM stripped from per-page .md files when concatenating llms-full.txt
  • getGitDate memoization: results cached in a module-level Map to avoid redundant git log subprocesses
  • atom:link rel="self": corrected to point to ${siteUrl}/feed.xml (canonical root) instead of the versioned path
  • Agent signaling: llms.txt href in injected HTML directive uses root absolute URL
  • test:versions script: now fully simulates the publish-root-files CI step — copies llms.txt, llms-full.txt, feed.xml, og-image.png from the latest version to the test root, and generates robots.txt and sitemap.xml
  • Sitemap spec compliance: root sitemap.xml now points directly to /${LATEST_VERSION}/sitemap-0.xml instead of sitemap-index.xml — a sitemapindex must reference sitemaps, not other sitemapindex files
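
To make the sitemap point concrete, here is a sketch of a root index that references the latest version's `sitemap-0.xml` directly. `buildRootSitemapIndex` is a hypothetical helper (the real file is generated by the CI workflow), and it assumes both arguments are already URL-safe.

```javascript
// Hypothetical helper: a sitemapindex whose single entry is a real sitemap,
// not another sitemapindex file.
function buildRootSitemapIndex(siteUrl, latestVersion) {
  const loc = `${siteUrl}/${latestVersion}/sitemap-0.xml`;
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    '  <sitemap>',
    `    <loc>${loc}</loc>`,
    '  </sitemap>',
    '</sitemapindex>',
    '',
  ].join('\n');
}

console.log(buildRootSitemapIndex('https://cli.internetcomputer.org', '0.2'));
```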


Copilot AI left a comment


Pull request overview

This PR enhances the docs site’s SEO and discovery for the versioned deployment model (latest version indexed at root), adding social preview assets, RSS/sitemap improvements, and AI-agent discovery outputs.

Changes:

  • Adds global SEO metadata + JSON-LD structured data and a custom Footer with discovery links.
  • Extends the astro-agent-docs build hook to generate llms-full.txt, feed.xml, og-image.png, and inject git-based sitemap <lastmod>.
  • Updates CI to generate root robots.txt + root sitemap.xml and to copy root-level artifacts (llms*, feed.xml, og-image.png) from the latest versioned build.

Reviewed changes

Copilot reviewed 8 out of 10 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| docs-site/test-version-switcher.sh | Mirrors CI “latest version” root redirect + root artifact generation for local multi-version testing. |
| docs-site/src/components/Footer.astro | Adds RSS + llms.txt discovery links via a Starlight Footer override. |
| docs-site/public/og-image.svg | Introduces the source SVG used to render the share image PNG at build time. |
| docs-site/plugins/astro-agent-docs.mjs | Generates llms-full.txt, RSS feed, git-based sitemap lastmod, and renders og-image.png via resvg. |
| docs-site/package.json | Adds test:versions script; adds resvg + Inter font dev dependencies. |
| docs-site/package-lock.json | Locks new dev dependencies and transitive additions. |
| docs-site/astro.config.mjs | Adds global meta tags, RSS link, JSON-LD schemas, and wires in the Footer override. |
| docs-site/README.md | Documents new build outputs and the root-file publishing pattern. |
| .github/workflows/docs.yml | Generates root robots.txt + sitemap.xml, copies additional root-level SEO/discovery files, and fetches full git history for git-date features. |
| .claude/skills/release/task6-docs.md | Updates release-process docs to reflect the new root-file copy behavior post-merge. |
Files not reviewed (1)
  • docs-site/package-lock.json: Language not supported


Comment thread docs-site/README.md Outdated
Comment thread docs-site/plugins/astro-agent-docs.mjs Outdated
Comment thread docs-site/plugins/astro-agent-docs.mjs Outdated
Comment thread docs-site/plugins/astro-agent-docs.mjs
Comment thread docs-site/astro.config.mjs Outdated
Comment thread .github/workflows/docs.yml Outdated
Comment thread docs-site/test-version-switcher.sh
- Replace execSync shell interpolation with spawnSync args array in getGitDate
  to avoid shell injection on unusual file paths
- Use date.slice(0, 10) for sitemap lastmod instead of toISOString() to avoid
  UTC conversion shifting the date for commits near midnight
- Strip trailing slash from SITE constant to prevent double slashes in URLs
- Clarify robots.txt comment: /main/ is conditionally disallowed, not always
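
The first item above, combined with the memoization from the earlier round, might look roughly like this. It is a sketch rather than the plugin's code: the exact `git log` arguments, the `run` parameter (added here only so the function can be exercised without a repository), and the use of CommonJS `require` so the snippet runs standalone are all assumptions; the plugin itself is an ES module.

```javascript
const { spawnSync } = require('node:child_process');

// Module-level cache: one git subprocess per file path at most.
const gitDateCache = new Map();

// Returns the last commit date for a file as a strict ISO 8601 string (%cI), or null.
// `run` defaults to spawnSync and is injectable purely for illustration and testing.
function getGitDate(filePath, run = spawnSync) {
  if (gitDateCache.has(filePath)) return gitDateCache.get(filePath);
  // Arguments are passed as an array, so the path is never parsed by a shell.
  const result = run('git', ['log', '-1', '--format=%cI', '--', filePath], { encoding: 'utf8' });
  const ok = !result.error && result.status === 0;
  const date = ok ? (result.stdout.trim() || null) : null;
  gitDateCache.set(filePath, date);
  return date;
}

// Prints a date inside a git checkout that tracks this path, otherwise null.
console.log(getGitDate('README.md'));
```

Because no shell is involved, a path containing quotes, spaces, or `$(...)` is handed to git as a single literal argument.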
@marc0olo
Member Author

Addressed the new (non-duplicate) findings from the second review:

  • execSync → spawnSync: getGitDate() now passes file paths as an args array to spawnSync instead of interpolating them into a shell string, which avoids breakage on paths with quotes or special characters
  • Sitemap lastmod date: replaced new Date(date).toISOString().split('T')[0] with date.slice(0, 10) — preserves the original calendar date from %cI without UTC conversion shifting it for commits near midnight
  • SITE trailing slash: astro.config.mjs now strips a trailing slash from PUBLIC_SITE to prevent double slashes in generated URLs
  • robots.txt comment: clarified that /main/ is conditionally disallowed (only when a release exists), matching the actual implementation
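
The lastmod change is easiest to see with a commit made shortly after midnight in a timezone ahead of UTC. The timestamp below is an invented example value, and `lastmodFromGitDate` is a hypothetical name for the slicing step.

```javascript
// Keep the calendar date exactly as git reported it (first 10 chars of %cI output).
function lastmodFromGitDate(isoDate) {
  return isoDate.slice(0, 10);
}

// Invented example: 00:30 local time at UTC+02:00 is still the previous day in UTC.
const commitDate = '2026-04-17T00:30:00+02:00';

console.log(lastmodFromGitDate(commitDate));                    // 2026-04-17
console.log(new Date(commitDate).toISOString().split('T')[0]); // 2026-04-16
```

The second line shows the old behavior: converting through `Date` normalizes to UTC first, which moves the date back by one day.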

Not acted on — atom:link rel="self": Copilot suggests matching the versioned URL, but the versioned feed.xml is an intermediate build artifact. We link users to the root canonical URL everywhere, so the root self-link is intentional. The versioned copy is never meant to be a direct subscription URL.

The re-flagged comments (canonical URLs, fetch-depth, BOM, memoization, robots edge case) were already fixed in the previous follow-up — Copilot is re-reviewing the original diff lines.


Copilot AI left a comment


Pull request overview

This PR improves the docs site’s SEO and “discovery” surface area in a versioned-deployment environment (latest docs served under a version path, with key SEO files published at the domain root).

Changes:

  • Adds build-time generation of og-image.png, feed.xml, llms-full.txt, and injects git-based <lastmod> into generated sitemaps.
  • Publishes root-level SEO/discovery files via CI (robots.txt, root sitemap.xml, plus copying og-image.png / feed.xml / llms*.txt from the latest version folder).
  • Adds a Starlight Footer override and global head tags (RSS link, OG image meta, robots/author meta, JSON-LD structured data).

Reviewed changes

Copilot reviewed 8 out of 10 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| docs-site/test-version-switcher.sh | Extends local version-layout simulation to mirror CI root-file generation/copying. |
| docs-site/src/components/Footer.astro | New Footer override adding RSS + llms.txt discovery links. |
| docs-site/public/og-image.svg | Adds the source SVG used to render the social preview PNG at build time. |
| docs-site/plugins/astro-agent-docs.mjs | Generates llms-full, RSS feed, injects sitemap lastmod from git, renders OG PNG via resvg. |
| docs-site/package.json | Adds a test:versions script and new dev dependencies for OG image generation. |
| docs-site/package-lock.json | Locks new dependencies (@resvg/resvg-js, @fontsource/inter) and transitive changes. |
| docs-site/astro.config.mjs | Sets a default SITE, adds global meta/link tags + JSON-LD, wires in Footer override. |
| docs-site/README.md | Documents new build artifacts and CI root-file publishing behavior. |
| .github/workflows/docs.yml | Generates robots.txt + root sitemap.xml and copies root-level artifacts from latest version. |
| .claude/skills/release/task6-docs.md | Updates release docs to reflect updated deployment paths and root-file copying behavior. |
Files not reviewed (1)
  • docs-site/package-lock.json: Language not supported


Comment thread docs-site/src/components/Footer.astro Outdated
Comment thread .github/workflows/docs.yml
Comment thread docs-site/plugins/astro-agent-docs.mjs Outdated
Comment thread docs-site/astro.config.mjs
Comment thread docs-site/src/components/Footer.astro
- Use root-relative /feed.xml and /llms.txt for head discovery links and
  footer links so local/staging environments don't advertise production URLs;
  absolute URLs are kept only for og:image/twitter:image where required
- Replace CDATA with escapeXml() for RSS item descriptions to avoid invalid
  XML if a description contains the ]]> CDATA terminator sequence
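
A sketch of the escaping approach from the second item: every XML-significant character is replaced with an entity, so a `]]>` sequence in a description becomes harmless text. `escapeXml` is named in the commit message, but this body is an assumed implementation, and `rssDescription` is a hypothetical wrapper.

```javascript
// Assumed implementation: escape the five predefined XML entities.
// The ampersand must be replaced first so later entities are not double-escaped.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical wrapper for one RSS item field.
function rssDescription(text) {
  return `<description>${escapeXml(text)}</description>`;
}

console.log(rssDescription('Index with a[b[0]]>1 & "quotes"'));
```

With a CDATA wrapper, the `]]>` in this sample input would have closed the section early and broken the feed.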
Copy link
Copy Markdown

Copilot AI left a comment


Pull request overview

This PR enhances SEO and content discoverability for the versioned Astro/Starlight docs deployment (root + /<version>/ folders) by adding a consistent OG image, RSS feed, structured data, robots/sitemap root files, and LLM ingestion artifacts, with CI updates to publish the correct root-level files from the latest version.

Changes:

  • Add global SEO metadata + JSON-LD structured data via Starlight head configuration.
  • Extend the astro-agent-docs build hook to generate llms-full.txt, feed.xml, og-image.png, and inject git-based <lastmod> into sitemap files.
  • Update CI/local tooling to generate/copy root-level files (robots.txt, root sitemap.xml, feed.xml, llms*.txt, og-image.png) based on versions.json.

Reviewed changes

Copilot reviewed 8 out of 10 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| docs-site/astro.config.mjs | Adds global meta tags, JSON-LD, RSS/link discovery, and wires in a custom Footer override. |
| docs-site/plugins/astro-agent-docs.mjs | Generates llms-full.txt, RSS feed, injects git-based sitemap lastmod, and renders og-image.png from og-image.svg. |
| .github/workflows/docs.yml | Generates root robots.txt + root sitemap index and copies latest version root-discovery files into deployment root; ensures full git history for build jobs. |
| docs-site/src/components/Footer.astro | Adds footer discovery links for RSS and llms.txt. |
| docs-site/public/og-image.svg | Adds source SVG for OG image generation. |
| docs-site/test-version-switcher.sh | Updates local multi-version simulation to mirror new root-file behavior. |
| docs-site/package.json | Adds test:versions script; adds build-time dependencies for OG image rendering. |
| docs-site/package-lock.json | Locks newly added dependencies. |
| docs-site/README.md | Documents updated build artifacts and root-file publishing approach. |
| .claude/skills/release/task6-docs.md | Updates release process documentation for the new root-file copy behavior. |
Files not reviewed (1)
  • docs-site/package-lock.json: Language not supported


Comment thread docs-site/plugins/astro-agent-docs.mjs Outdated
@raymondk raymondk merged commit 5bd1485 into main Apr 17, 2026
93 checks passed
@raymondk raymondk deleted the feat/seo-improvements branch April 17, 2026 17:27
marc0olo added a commit that referenced this pull request Apr 17, 2026
…od, structured data (#508)

* feat(docs-site): add SEO improvements, OG image, RSS feed, and agent-friendly docs

- Generate og-image.png at build time from og-image.svg via @resvg/resvg-js
- Add JSON-LD structured data (WebSite + Organization schemas) to all pages
- Add og:image, og:image:alt, twitter:image, robots, and author meta tags
- Generate RSS 2.0 feed (feed.xml) with git-accurate publish dates per page
- Generate llms-full.txt for RAG pipeline ingestion
- Inject git-accurate <lastmod> dates into the Starlight-generated sitemap
- Add dynamic robots.txt generation in CI (blocks old versioned paths, /main/)
- Add root sitemap.xml index in CI pointing to latest version's sitemap
- Copy og-image.png, llms.txt, llms-full.txt, feed.xml from latest version to root in CI
- Add custom Footer component with RSS and llms.txt discovery links
- Update docs-site/README.md and task6-docs.md to reflect new build/deploy behavior

Closes #507

* feat(docs-site): add SEO improvements, OG image, RSS feed, and agent-friendly docs

- Generate og-image.png at build time from og-image.svg via @resvg/resvg-js
- Add JSON-LD structured data (WebSite + Organization schemas) to all pages
- Add og:image, og:image:alt, twitter:image, robots, and author meta tags
- Generate RSS 2.0 feed (feed.xml) with git-accurate publish dates per page
- Generate llms-full.txt for RAG pipeline ingestion
- Inject git-accurate <lastmod> dates into the Starlight-generated sitemap
- Add dynamic robots.txt generation in CI (blocks old versioned paths, /main/)
- Add root sitemap.xml index in CI pointing to latest version's sitemap
- Copy og-image.png, llms.txt, llms-full.txt, feed.xml from latest version to root in CI
- Add custom Footer component with RSS and llms.txt discovery links
- Update docs-site/README.md and task6-docs.md to reflect new build/deploy behavior

Closes #507

* fix(docs-site): address Copilot review feedback on SEO implementation

- Use root-absolute URLs for feed.xml and llms.txt in head links and Footer
  so feed readers and agents always discover the canonical root endpoint
- Use site: SITE (with fallback) instead of site: process.env.PUBLIC_SITE
  so siteUrl is always populated in the plugin during local builds
- Add fetch-depth: 0 to all three build job checkouts so git log returns
  accurate dates for sitemap lastmod and RSS pubDate
- Fix robots.txt contradictory Allow/Disallow /main/ when LATEST_VERSION=main
- Strip BOM from per-page .md files when concatenating llms-full.txt
- Memoize getGitDate() results to avoid redundant git log subprocesses
- Use root-absolute siteUrl for agent signaling directive href in HTML pages

* fix(docs-site): use canonical root URL for feed atom:self; simulate root files in test script

- Fix atom:link rel="self" in feed.xml to point to ${siteUrl}/feed.xml
  (canonical subscription URL) instead of the versioned path
- Update test-version-switcher.sh to simulate the publish-root-files CI step:
  copies llms.txt, llms-full.txt, feed.xml, og-image.png from the latest
  version folder to dist-test/ root, and generates robots.txt

* docs(docs-site): mention test-version-switcher.sh in README

* chore(docs-site): add test:versions npm script for version switcher testing

* fix(docs-site): generate root sitemap.xml in test:versions script

* fix(docs-site): point root sitemap.xml directly to sitemap-0.xml for spec compliance

* fix(docs-site): address second round of Copilot review feedback

- Replace execSync shell interpolation with spawnSync args array in getGitDate
  to avoid shell injection on unusual file paths
- Use date.slice(0, 10) for sitemap lastmod instead of toISOString() to avoid
  UTC conversion shifting the date for commits near midnight
- Strip trailing slash from SITE constant to prevent double slashes in URLs
- Clarify robots.txt comment: /main/ is conditionally disallowed, not always

* docs(docs-site): clarify robots.txt /main/ conditional behavior in README

* fix(docs-site): address third round of Copilot review feedback

- Use root-relative /feed.xml and /llms.txt for head discovery links and
  footer links so local/staging environments don't advertise production URLs;
  absolute URLs are kept only for og:image/twitter:image where required
- Replace CDATA with escapeXml() for RSS item descriptions to avoid invalid
  XML if a description contains the ]]> CDATA terminator sequence

* chore(docs-site): fix step numbering in astro-agent-docs build hook comments
Development

Successfully merging this pull request may close these issues.

chore: SEO and metadata improvements for cli.internetcomputer.org

4 participants