Better mini sitemap for the chatbot #1337

Open
nrichers wants to merge 5 commits into main from nrichers/sc-16170/better-mini-sitemap-for-the-chatbot

nrichers (Collaborator) commented May 20, 2026

Pull Request Description

What and why?

This PR aims to improve chatbot responses by adding a map from the product UI to the docs for RAG, alongside the existing AGENTS.md docs index.

Response quality should improve across the board wherever answers depend on docs RAG, but I am hopeful that settings-related questions will improve most, because 1) settings cover a large surface area in the UI and 2) our docs content architecture groups settings info with the human-centric tasks the settings belong to.

Summary of changes

  • Adds site/scripts/generate_chatbot_product_map.py to build a product-aligned map from frontend routes/help links to documentation URLs and H2/H3 section hints (committed as site/llm/chatbot-product-map.md).
  • Wires the generator into site/llm/render.sh and CI so the map and updated AGENTS.md are copied into site/llm/_llm-output/ for LanceDB ingestion.
  • Includes the "Using the documentation" hub page in the LLM corpus by excluding only contributor/style-guide pages under about/contributing/ (Quarto does not re-include files after a directory exclusion).
  • Extends AGENTS.md with a Product UI mapping section describing when Valerie should use the map vs docs-by-topic navigation.
  • Vendors frontend route/help-link data in site/llm/chatbot-product-map-frontend-snapshot.json so CI can build the map without a validmind/frontend checkout.
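To illustrate the "H2/H3 section hints" idea from the first bullet, here is a minimal sketch of how headings could be pulled from a docs page and rendered as one map entry. It is not the code in generate_chatbot_product_map.py; the function names, the bullet layout, and the example URL are all hypothetical, and it assumes Pandoc-style `{#anchor}` heading attributes.

```python
import re

# Matches Markdown H2/H3 headings such as "## Title" or "### Title {#anchor}".
# H1 and H4+ are ignored on purpose (assumption: only H2/H3 are useful hints).
HEADING_RE = re.compile(r"^(#{2,3})\s+(.+?)\s*(?:\{#([\w-]+)\})?\s*$")


def section_hints(markdown_text):
    """Return (level, title, anchor) tuples for H2/H3 headings, skipping fenced code."""
    hints = []
    in_fence = False
    for line in markdown_text.splitlines():
        if line.lstrip().startswith("`" * 3):
            # Toggle on every fence line so "## ..." inside code is not a heading.
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING_RE.match(line)
        if match:
            hashes, title, anchor = match.groups()
            hints.append((len(hashes), title, anchor))
    return hints


def render_map_entry(route, doc_url, hints):
    """Render one route as a Markdown bullet with indented section hints."""
    lines = [f"- `{route}` -> {doc_url}"]
    for level, title, anchor in hints:
        indent = "  " * (level - 1)
        target = f"{doc_url}#{anchor}" if anchor else doc_url
        lines.append(f"{indent}- {title} ({target})")
    return "\n".join(lines)
```

Keeping the hints nested under the route gives the retriever a route-first chunk, which matches the stated goal of answering "where is this setting?" questions.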

Fixes sc-16170 — Better mini sitemap for the chatbot

How to test

  • python3 -m unittest discover -s site/scripts -p 'test_generate_chatbot_product_map.py' -v
  • python3 site/scripts/generate_chatbot_product_map.py, then git diff --exit-code site/llm/chatbot-product-map.md
  • make render-llm locally
  • Confirm site/llm/_llm-output/chatbot-product-map.md and site/llm/_llm-output/about/contributing/using-the-documentation.md exist; no validmind-community.md or style-guide/ under contributing

To refresh the vendored frontend snapshot after product UI link changes: make -C site refresh-chatbot-product-map (requires local validmind/frontend checkout).
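The last check in the list above can be scripted. The sketch below is only an illustration: `check_llm_output` is a hypothetical helper, and the exact locations of validmind-community.md and style-guide/ under about/contributing/ are assumed from the wording of the checklist.

```python
from pathlib import Path

# Files the checklist expects, relative to the LLM output directory.
EXPECTED = [
    "chatbot-product-map.md",
    "about/contributing/using-the-documentation.md",
]
# Contributor-only content that should stay out of the corpus
# (assumed to sit directly under about/contributing/).
FORBIDDEN = [
    "about/contributing/validmind-community.md",
    "about/contributing/style-guide",
]


def check_llm_output(output_dir):
    """Return a list of problems; an empty list means the corpus looks right."""
    root = Path(output_dir)
    problems = []
    for rel in EXPECTED:
        if not (root / rel).is_file():
            problems.append(f"missing: {rel}")
    for rel in FORBIDDEN:
        if (root / rel).exists():
            problems.append(f"should be excluded: {rel}")
    return problems
```

After make render-llm, calling `check_llm_output("site/llm/_llm-output")` and failing on a non-empty result would cover both the inclusion and the exclusion side of the check.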

What needs special review?

  • CI map verification and whether the vendored frontend snapshot workflow is clear enough for maintainers (site/llm/README.md).
  • LLM corpus exclusions under about/contributing/ — confirm only contributor/style-guide pages are omitted.

Dependencies, breaking changes, and deployment notes

  • No PR dependencies.
  • Post-merge: LanceDB must re-ingest the LLM corpus for the new map to affect Valerie in production.
  • Refresh chatbot-product-map-frontend-snapshot.json when frontend routes or helpLink values change.

Release notes

Internal — not externalized in release notes.

Checklist

  • What and why
  • Screenshots or videos (Frontend) — N/A
  • How to test
  • What needs special review
  • Dependencies, breaking changes, and deployment notes
  • Labels applied (internal)
  • PR linked to Shortcut
  • Unit tests added (Backend)
  • Tested locally
  • Documentation updated (if required)
  • Environment variable additions/changes documented (if required) — N/A

Generate a product-aligned mini sitemap from frontend routes and help links,
wire it into LLM render output, include the docs IA hub page in the corpus,
and verify the artifact in CI.
@github-actions
Contributor

Pull requests must include at least one of the required labels: internal, highlight, enhancement, bug, deprecation, documentation. Except for internal, pull requests must also include a description in the release notes section.

nrichers changed the title from "Better mini sitemap for the chatbot (sc-16170)" to "Better mini sitemap for the chatbot" May 20, 2026

nrichers added the internal label (Not to be externalized in the release notes) May 20, 2026
nrichers added 3 commits May 19, 2026 18:56
…kout)

Store extracted routes and help links in site/llm/chatbot-product-map-frontend-snapshot.json
so CI builds the map without validmind/frontend access. Refresh locally with
make -C site refresh-chatbot-product-map when product UI links change.

Give maintainers a single entry point for render-llm, product map artifacts,
and when to refresh the vendored frontend snapshot.

Sort doc paths and related-doc suggestions so Linux and macOS produce the
same map, regenerate the committed artifact, and use unittest discover to
avoid the stdlib site module import collision.
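The commit message does not say where the Linux/macOS difference came from; a common cause is relying on filesystem enumeration order or set iteration order. A minimal sketch of a platform-independent ordering, with `stable_doc_paths` as a hypothetical helper rather than the function used in the PR:

```python
from pathlib import PurePosixPath


def stable_doc_paths(paths):
    """Deduplicate and sort doc paths the same way on every platform."""
    # Normalize separators so Windows-style input compares equal to POSIX input.
    normalized = {PurePosixPath(str(p).replace("\\", "/")).as_posix() for p in paths}
    # Sort case-insensitively first, then by exact text as a tie-breaker, so the
    # result never depends on the order in which the paths were discovered.
    return sorted(normalized, key=lambda p: (p.casefold(), p))
```

With an explicit sort key the generated map is byte-identical regardless of discovery order, which is what lets CI compare it against the committed artifact.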
@nrichers nrichers requested a review from kam-validmind May 20, 2026 17:41
@github-actions
Contributor

PR Summary

This pull request introduces a new feature aimed at generating and validating a product-to-documentation map used by the in-app chatbot (Valerie) for retrieval-augmented generation (RAG). The changes include the following key functional enhancements:

  1. A new Python script (site/scripts/generate_chatbot_product_map.py) has been added. This script extracts and correlates frontend routes with corresponding documentation URLs and headings by parsing both the committed frontend snapshot and source files. It leverages regex patterns to identify help links, documentation references, and settings group titles, thereby generating a markdown map (site/llm/chatbot-product-map.md).

  2. New targets have been added to the Makefile:

    • generate-chatbot-product-map: to generate the product map using the pre-committed frontend snapshot.
    • refresh-chatbot-product-map: to regenerate the snapshot and map when there are updates in the frontend repository (requires a local checkout).
  3. The CI workflow (.github/workflows/validate-docs-site.yaml) now includes additional steps to:

    • Verify that the chatbot product map (both markdown and JSON snapshot) is up to date by running the generator and comparing against committed files.
    • Run a unit test suite (python3 -m unittest discover -s site/scripts -p 'test_generate_chatbot_product_map.py' -v) to ensure functionality of the chatbot product map generator.
    • Ensure that the LLM corpus includes the updated chatbot map and documentation IA hub files.
  4. Documentation updates include changes to AGENTS.md and the addition of a README in site/llm to explain the purpose of the product map and instructions on how to regenerate it.

  5. The render script (site/llm/render.sh) has been updated to run the chatbot product map generator and copy the appropriate files into the LLM output directory ensuring consistency between the source and rendered artifacts.
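As a rough illustration of the regex-based extraction mentioned in item 1, the sketch below collects helpLink values from TSX/TS source text. The pattern is an assumption about how helpLink might appear (JSX prop or object key with a quoted string); the real generator's patterns may differ.

```python
import re

# Hypothetical pattern: helpLink as a JSX prop (helpLink="...", helpLink={"..."})
# or as an object key (helpLink: '...'), followed by a quoted string value.
HELP_LINK_RE = re.compile(r"""helpLink\s*[:=]\s*\{?\s*["']([^"']+)["']""")


def extract_help_links(source_text):
    """Return the unique helpLink values found in the source, in sorted order."""
    return sorted(set(HELP_LINK_RE.findall(source_text)))
```

Returning a sorted, de-duplicated list keeps the vendored snapshot stable between runs when the same link is referenced from several components.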

Overall, the PR integrates a robust mechanism to automatically generate, test, and validate a product-to-documentation map that aligns frontend routes with their corresponding documentation. This supports more accurate contextual assistance in the chatbot and streamlines maintenance for documentation updates.

Test Suggestions

  • Run the unit tests using python3 -m unittest discover -s site/scripts -p 'test_generate_chatbot_product_map.py' -v to verify the behavior of the product map generator.
  • Trigger the CI workflow locally (or via a test branch) to ensure that the new GitHub Actions steps correctly detect mismatches in the generated files.
  • Manually run the Makefile targets generate-chatbot-product-map and refresh-chatbot-product-map to verify that the product map is generated as expected.
  • Verify that after a successful run, the chatbot-product-map.md and the associated JSON snapshot are updated and correctly copied into the LLM output directory.
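The mismatch detection described above amounts to regenerating the map and comparing it with the committed copy. A minimal sketch of that comparison using the standard library; `staleness_diff` is a hypothetical helper, and the workflow itself may simply rely on git diff --exit-code:

```python
import difflib


def staleness_diff(committed_text, generated_text, name="chatbot-product-map.md"):
    """Return a unified diff; an empty string means the committed file is current."""
    diff = difflib.unified_diff(
        committed_text.splitlines(keepends=True),
        generated_text.splitlines(keepends=True),
        fromfile=f"committed/{name}",
        tofile=f"generated/{name}",
    )
    return "".join(diff)
```

A CI step could print the returned diff and exit non-zero whenever it is non-empty, so the log shows exactly which map entries drifted.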

@github-actions
Contributor

Lighthouse check results

⚠️ WARN: Average accessibility score is 0.87 (required: >0.9) — Check the workflow run


Folder depth level checked: 0

Commit SHA: 51ee46d

Modify the workflow to check a different depth:

  • 0: Top-level navigation only (/index.html, /guide/guides.html, ...)
  • 1: All first-level subdirectories (/guide/*.html, /developer/*.html, ...)
  • 2: All second-level subdirectories (/guide/attestation/*.html, ...)

| Page | Accessibility | Performance | Best Practices | SEO |
|---|---|---|---|---|
| /developer/validmind-library.html | 0.85 | 0.68 | 1.00 | 0.82 |
| /get-started/get-started.html | 0.85 | 0.70 | 1.00 | 0.73 |
| /guide/guides.html | 0.85 | 0.69 | 1.00 | 0.82 |
| /index.html | 0.93 | 0.68 | 1.00 | 0.82 |
| /releases/all-releases.html | 0.86 | 0.69 | 1.00 | 0.73 |
| /support/support.html | 0.91 | 0.69 | 1.00 | 0.82 |
| /training/training.html | 0.85 | 0.61 | 0.96 | 0.73 |

@github-actions
Contributor

Validate docs site

✓ INFO: A live preview of the docs site is available — Open the preview

{
"version": 1,
"generated_at": "2026-05-20T01:55:33.719811+00:00",
"frontend_root": "/Users/nrichers/GitHub/validmind/frontend",

Did you want your local Mac machine path (with nrichers in it) in this output?
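If the absolute path is not wanted, one option is to store frontend_root relative to a known parent directory before writing the snapshot. This is only a sketch: `portable_root` and the choice of parent directory are hypothetical, not part of the PR.

```python
from pathlib import PurePosixPath


def portable_root(frontend_root, repo_parent):
    """Express frontend_root relative to repo_parent, or fall back to its name."""
    root = PurePosixPath(frontend_root)
    try:
        return root.relative_to(PurePosixPath(repo_parent)).as_posix()
    except ValueError:
        # Not under repo_parent: keep only the final directory name so no
        # machine-specific prefix ends up in the committed snapshot.
        return root.name
```

For the value shown in the snippet, passing "/Users/nrichers/GitHub" as the parent would leave "validmind/frontend" in the JSON, which keeps the committed file identical across maintainers' machines.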

"anchor": null
}
],
"/settings/index.tsx": [

@kam-validmind kam-validmind May 22, 2026

Is settings/index.tsx a route? It looks like a React file instead.
