
fix: md to pdf #2760

Open
tomaszantas wants to merge 8 commits into langfuse:main from Altalogy:fix/md-to-pdf

Conversation

@tomaszantas
Contributor

@tomaszantas tomaszantas commented Apr 1, 2026

Description

Fix api/md-to-pdf.

[Two images attached]

@vercel

vercel bot commented Apr 1, 2026

@tomaszantas is attempting to deploy a commit to the langfuse Team on Vercel.

A member of the Team first needs to authorize it.

@tomaszantas tomaszantas marked this pull request as ready for review April 1, 2026 16:32

@claude claude bot left a comment


Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@dosubot dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files.) and bug (Something isn't working) labels Apr 1, 2026
@felixkrrr
Contributor

@claude pls review

@felixkrrr
Contributor

@jannikmaierhoefer can u pls approve the dev deployment?

@felixkrrr
Contributor

@claude pls review

@jannikmaierhoefer jannikmaierhoefer self-requested a review April 2, 2026 08:51
Contributor

@felixkrrr felixkrrr left a comment


Thanks for looking into this — the md-to-pdf endpoint is definitely broken in production and the Chromium bundling fixes are solid work. A few things need to be addressed before we can merge.

The core issue

langfusePathToMdSrcPath() patches one consumer (the md-to-pdf API route), but the underlying problem affects all .md URL access — the "Copy as Markdown" button, direct .md URLs, and content negotiation are all broken for the same pages.

We tested production and found 30 broken .md URLs:

23 marketing pages (all except /support.md which has a one-off beforeFiles rewrite):
/about.md, /careers.md, /cn.md, /community.md, /cookie-policy.md, /enterprise.md, /find-us.md, /imprint.md, /jp.md, /jp-cloud.md, /kr.md, /non-profit.md, /oss-friends.md, /press.md, /pricing.md, /pricing-self-host.md, /privacy.md, /research.md, /startups.md, /talk-to-us.md, /terms.md, /watch-demo.md, /wrapped.md

7 user story pages (URL path is /users/* but md-src/ has them under customers/):
/users.md, /users/canva.md, /users/cresta.md, /users/khan-academy.md, /users/magic-patterns-ai-design-tools.md, /users/merckgroup.md, /users/sumup.md

The root cause is in scripts/copy_md_sources.js — it copies files preserving the content/ directory structure, but the URL structure differs for these sections:

| Content path | md-src/ path (current) | URL path | Result |
| --- | --- | --- | --- |
| content/marketing/terms.mdx | md-src/marketing/terms.md | /terms.md → rewrite → md-src/terms.md | 404 |
| content/customers/canva.mdx | md-src/customers/canva.md | /users/canva.md → rewrite → md-src/users/canva.md | 404 |
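To make the mismatch concrete, here is a minimal simulation of the two rows above. It is only a sketch: `mdSrcFiles`, `rewriteMdUrl`, and `resolveStatus` are hypothetical names, and the real lookup is done by the Next.js rewrite and the static file server, not by code like this.

```javascript
// Files the copy script currently produces (the two examples from the rows above).
const mdSrcFiles = new Set([
  "md-src/marketing/terms.md",
  "md-src/customers/canva.md",
]);

// Mirrors the generic rewrite /:path*.md -> /md-src/:path*.md (illustrative only).
function rewriteMdUrl(urlPath) {
  if (!urlPath.endsWith(".md")) return null;
  return "md-src" + urlPath; // "/terms.md" -> "md-src/terms.md"
}

// Returns 200 when the rewritten path exists in md-src/, otherwise 404.
function resolveStatus(urlPath) {
  const target = rewriteMdUrl(urlPath);
  return target !== null && mdSrcFiles.has(target) ? 200 : 404;
}

console.log(resolveStatus("/terms.md"));       // → 404, file lives at md-src/marketing/terms.md
console.log(resolveStatus("/users/canva.md")); // → 404, file lives at md-src/customers/canva.md
```

In both cases the rewrite target is well-formed; the file simply sits one directory away from where the rewrite looks.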

Suggested fix

Fix copy_md_sources.js to copy files to paths matching the URL structure, not the content directory:

  • content/marketing/terms.mdx → md-src/terms.md
  • content/customers/canva.mdx → md-src/users/canva.md

The mapping from content directory → URL path already exists in lib/source.ts (marketing baseUrl: "", customers baseUrl: "/users"), so the copy script should use the same source of truth.

This way the existing generic rewrite /:path*.md → /md-src/:path*.md works for all consumers without any remapping code. langfusePathToMdSrcPath(), lib/marketing-slugs.ts, and the one-off /support.md beforeFiles rewrite all become unnecessary.
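A rough sketch of what the path mapping in the copy script could look like. The names `BASE_URL_BY_CONTENT_DIR` and `contentPathToMdSrcPath` are hypothetical, the map is hardcoded here only to keep the example self-contained (the real script would ideally read the values from lib/source.ts), and the fallback for sections without an entry is an assumption.

```javascript
// Assumed mirror of the baseUrl values quoted from lib/source.ts.
// Hardcoded here for illustration; the real script should reuse the source of truth.
const BASE_URL_BY_CONTENT_DIR = new Map([
  ["marketing", ""],
  ["customers", "/users"],
]);

// Maps a content file path to the md-src/ path that matches its public URL.
// Returns null for paths that are not content/<section>/<file>.md(x).
function contentPathToMdSrcPath(contentPath) {
  const match = /^content\/([^/]+)\/(.+)\.mdx?$/.exec(contentPath);
  if (!match) return null;
  const [, section, rest] = match;
  // Assumption: sections without an explicit entry are served under /<section>.
  const baseUrl = BASE_URL_BY_CONTENT_DIR.has(section)
    ? BASE_URL_BY_CONTENT_DIR.get(section)
    : "/" + section;
  // Note: index pages (if any) are not special-cased in this sketch.
  return "md-src" + baseUrl + "/" + rest + ".md";
}

console.log(contentPathToMdSrcPath("content/marketing/terms.mdx")); // → md-src/terms.md
console.log(contentPathToMdSrcPath("content/customers/canva.mdx")); // → md-src/users/canva.md
```

With output paths shaped like this, the generic rewrite needs no per-section knowledge at request time.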

What to keep

  • outputFileTracingIncludes for Chromium binaries in next.config.mjs — this is correct and necessary.
  • Improved error handling/logging for chromium.executablePath() and the SPARTICUZ_CHROMIUM_BIN_DIR override — solid additions.
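For readers who have not opened the diff, the second bullet could look roughly like the sketch below. This is not the PR's code: `resolveChromiumExecutable` is a hypothetical helper, the `chromium` argument is assumed to behave like the @sparticuz/chromium export (an object with an async `executablePath(input?)` method), and the log format is made up.

```javascript
// `chromium` is assumed to expose an async executablePath(input?) method,
// as the @sparticuz/chromium package does. It is passed in so the helper
// can be exercised with a stub.
async function resolveChromiumExecutable(chromium, env = process.env) {
  // Optional override pointing at a directory that contains the Chromium binaries.
  const binDir = env.SPARTICUZ_CHROMIUM_BIN_DIR;
  try {
    const executablePath = binDir
      ? await chromium.executablePath(binDir)
      : await chromium.executablePath();
    if (!executablePath) {
      throw new Error("executablePath() returned an empty value");
    }
    return executablePath;
  } catch (error) {
    // Log enough context to tell a missing binary from a bad override, then rethrow.
    console.error("[md-to-pdf] Could not resolve Chromium executable", {
      binDir: binDir ?? null,
      message: error instanceof Error ? error.message : String(error),
    });
    throw error;
  }
}
```

Rethrowing after logging keeps the failure visible to the route handler, which can then return an explicit error response instead of hanging on browser launch.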

What to change/remove

  • tsconfig.tsbuildinfo — remove from the commit, it's a 1.4MB build artifact.
  • langfusePathToMdSrcPath() and lib/marketing-slugs.ts — should become unnecessary once the copy script is fixed.
  • The /support.md beforeFiles rewrite already on main — can be removed as part of this fix.

@felixkrrr
Contributor

Correction to my review above: The "Copy as Markdown" button only exists on docs-style pages (docs, self-hosting, integrations, guides, library) — not on marketing pages or user stories. So the broken .md URLs for marketing/user-story pages don't affect that button.

The practical impact of the 30 broken .md URLs is:

  • md-to-pdf — the "download as PDF" links on /terms and /privacy (which this PR targets)
  • Direct .md access — anyone (or any LLM/tool) appending .md to a marketing or user-story URL
  • MCP getLangfuseDocsPage — if it fetches .md URLs for these pages

The rest of the review stands as-is.

@felixkrrr
Contributor

To clarify the above — here's why fixing this inside the API route isn't the right layer.

The .md URLs are broken for 30 pages in production (23 marketing + 7 user stories). The root cause is a mismatch in scripts/copy_md_sources.js: it copies files preserving the content/ directory structure (content/marketing/terms.mdx → md-src/marketing/terms.md), but the URL structure serves them at a different path (/terms). So the generic rewrite /:path*.md → /md-src/:path*.md fails.

This PR adds langfusePathToMdSrcPath() to remap paths inside the md-to-pdf API route. The problem with that approach:

  1. It only fixes one consumer. The same .md URLs are used by direct access (anyone appending .md), the MCP server's getLangfuseDocsPage tool, and content negotiation. Each would need its own copy of the remapping logic.
  2. It introduces a hardcoded MARKETING_SLUGS list that has to stay in sync with content/marketing/ manually — a maintenance trap. Every new marketing page would silently break until someone remembers to update the list.
  3. It duplicates routing knowledge that already exists in lib/source.ts (marketing baseUrl: "", customers baseUrl: "/users"). The copy script should use the same source of truth instead of re-deriving it.

If we fix copy_md_sources.js to output files at URL-matching paths (content/marketing/terms.mdx → md-src/terms.md, content/customers/canva.mdx → md-src/users/canva.md), the generic rewrite handles everything: no remapping code, no slug list, and all consumers work automatically.

(Correction to my earlier review: the "Copy as Markdown" button only exists on docs-style pages, not marketing/user-story pages, so it's not directly affected here.)

@felixkrrr felixkrrr dismissed their stale review April 2, 2026 09:40

used suggest changes falsely

@felixkrrr felixkrrr removed the request for review from jannikmaierhoefer April 2, 2026 09:40
@tomaszantas tomaszantas requested a review from felixkrrr April 2, 2026 13:43