Add EPUB and Word (.docx) export pipeline #172
- scripts/build-epub.js: builds a single EPUB from all 44 docs/ files
- epub/metadata.yaml: title, author, language, TOC depth, CC BY 4.0 rights
- epub/epub.css: clean typesetting (serif body, sans headings, code blocks, tables)
- .github/workflows/build-epub.yml: auto-builds on docs/ changes, uploads as artifact
- npm run build:epub script added to package.json

EPUB is generated locally but not committed (in .gitignore). The workflow artifact is retained for 90 days.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove WeasyPrint/PDF (dependency issues)
- Add Word .docx output using pandoc --to docx
- Both formats built from same preprocessed markdown
- CI workflow updated: uploads EPUB and Word artifacts (90-day)
- .gitignore updated for .docx
- pdf.css kept for reference but not used in build

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Peer Review Assigned

Hi @taylorarndt! Your PR has been automatically paired with @accesswatch for peer review.

For @accesswatch: This is a great opportunity to practice code review skills! Here's what to look for:

Content Quality:
Accessibility:
Documentation:
Review Guidelines:
Resources:

Pairing by Learning Room Grouping Engine
Pull request overview
This PR adds an automated EPUB and Word (.docx) export pipeline for the docs/ folder, enabling accessible reading on e-readers, screen readers, and for printing. The build script preprocesses 44+ markdown files (stripping podcast callouts, rewriting internal links) and runs pandoc to generate both formats. A GitHub Actions workflow automates the build on docs changes.
Changes:
- New `scripts/build-epub.js` build script that preprocesses markdown files and invokes pandoc for EPUB and Word output
- New `.github/workflows/build-epub.yml` workflow that triggers on docs changes, builds both formats, uploads artifacts, and commits outputs
- New `epub/` directory with metadata, EPUB CSS, and PDF CSS files; updated `.gitignore` and `package.json`
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `scripts/build-epub.js` | Core build script: reads docs, preprocesses markdown, invokes pandoc for EPUB and DOCX |
| `.github/workflows/build-epub.yml` | CI workflow: triggers on docs changes, installs pandoc, runs build, uploads artifacts, commits outputs |
| `epub/metadata.yaml` | Pandoc metadata for the EPUB/DOCX (title, author, TOC settings) |
| `epub/epub.css` | Stylesheet for the EPUB output |
| `epub/pdf.css` | Stylesheet for future PDF output (not yet used by build script) |
| `.gitignore` | Ignores generated .epub and .docx output files |
| `package.json` | Adds `build:epub` npm script |
    // Pattern: > **Listen to Episode N:** ... line(s)
    content = content.replace(/^>[ \t]*\*\*Listen to Episode[^\n]*\n/gm, '');
The podcast callout regex only removes the `> **Listen to Episode...` line, but the actual callouts in the docs are preceded by a standalone `>` line (e.g., docs/00-pre-workshop-setup.md line 2). This leaves an orphaned empty blockquote marker `>` in the preprocessed output, which will render as an empty blockquote element in the EPUB/DOCX.
Consider expanding the pattern to also match and remove the preceding empty blockquote line, e.g. matching `^>\s*\n^>[ \t]*\*\*Listen to Episode[^\n]*\n`.
Suggested change:

    - // Pattern: > **Listen to Episode N:** ... line(s)
    - content = content.replace(/^>[ \t]*\*\*Listen to Episode[^\n]*\n/gm, '');
    + // Pattern:
    + // >
    + // > **Listen to Episode N:** ... line
    + content = content.replace(/^>\s*\n^>[ \t]*\*\*Listen to Episode[^\n]*\n/gm, '');
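One possible variation on this suggestion, as a minimal sketch rather than the script's actual code: make the bare `>` line optional so that callouts with and without the leading marker are both removed, and use `[ \t]*` instead of `\s*` so the match cannot run across blank lines. The helper name `stripPodcastCallouts` and the sample text are hypothetical.

```javascript
// Hypothetical helper; the name stripPodcastCallouts is not taken from the PR.
function stripPodcastCallouts(markdown) {
  // Optional bare ">" line, followed by the "> **Listen to Episode ..." line.
  // [ \t]* (rather than \s*) keeps the match on a single line at a time.
  const callout = /^(?:>[ \t]*\n)?>[ \t]*\*\*Listen to Episode[^\n]*\n/gm;
  return markdown.replace(callout, '');
}

// Made-up sample that mimics the callout shape described in the review.
const sample = [
  '# Pre-workshop setup',
  '>',
  '> **Listen to Episode 1:** Getting set up',
  '',
  'Install Git before the workshop.',
  '',
].join('\n');

console.log(JSON.stringify(stripPodcastCallouts(sample)));
```

Ordinary blockquotes are left alone because the optional group only matches a `>` followed by nothing but spaces or tabs.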
    #!/usr/bin/env node
    /**
     * Build EPUB from docs/ markdown files using Pandoc.
     * Outputs: epub/git-going-with-github.epub
The file-level doc comment says "Outputs: epub/git-going-with-github.epub" but the script also outputs epub/git-going-with-github.docx. The docstring should mention both outputs.
Suggested change:

    -  * Outputs: epub/git-going-with-github.epub
    +  * Outputs: epub/git-going-with-github.epub, epub/git-going-with-github.docx
      const size = (fs.statSync(EPUB_OUT).size / 1024).toFixed(1);
      console.log(`\nEPUB written: epub/git-going-with-github.epub (${size} KB)`);
    } catch (err) {
      console.error('\nPandoc EPUB failed. Is pandoc installed? Run: brew install pandoc');
The error message suggests brew install pandoc as the install command, but the CI workflow runs on ubuntu-latest where apt-get install pandoc is the correct command. This could be confusing for contributors who encounter the error in CI. Consider making the suggestion platform-aware, or at minimum mentioning both brew install pandoc (macOS) and sudo apt-get install pandoc (Linux).
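A platform-aware hint could look roughly like the sketch below. It assumes a Debian/Ubuntu-style Linux (as on `ubuntu-latest`) and falls back to listing both commands elsewhere; the helper name `pandocInstallHint` is hypothetical, and only the two install commands come from this review.

```javascript
// Hypothetical helper; only the brew and apt-get commands come from the review.
function pandocInstallHint(platform = process.platform) {
  switch (platform) {
    case 'darwin':
      return 'brew install pandoc';
    case 'linux':
      // Assumes an apt-based distribution such as the Ubuntu CI runner.
      return 'sudo apt-get install pandoc';
    default:
      // No command is guessed for other platforms; both known ones are listed.
      return 'brew install pandoc (macOS) or sudo apt-get install pandoc (Linux)';
  }
}

console.error(`Pandoc EPUB failed. Is pandoc installed? Run: ${pandocInstallHint()}`);
```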
    - name: Commit outputs to repository
      run: |
        git config user.name "github-actions[bot]"
        git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
        git add epub/git-going-with-github.epub epub/git-going-with-github.docx
        if [ -n "$(git status --porcelain)" ]; then
          git commit -m "chore: rebuild EPUB and Word from latest docs"
          git push
        else
          echo "No changes to commit"
        fi
The .gitignore file (lines 168-169) ignores epub/git-going-with-github.epub and epub/git-going-with-github.docx, but the workflow step at line 55 runs git add on those same files. Since they are gitignored, git add will silently skip them (they won't be staged), the porcelain check will find no changes, and the commit/push will never happen.
You need to either:
- Remove these entries from `.gitignore` so the files can be committed, or
- Use `git add -f epub/git-going-with-github.epub epub/git-going-with-github.docx` to force-add ignored files, or
- Remove the commit step entirely and rely solely on the artifact uploads.
      const size = (fs.statSync(EPUB_OUT).size / 1024).toFixed(1);
      console.log(`\nEPUB written: epub/git-going-with-github.epub (${size} KB)`);
    } catch (err) {
      console.error('\nPandoc EPUB failed. Is pandoc installed? Run: brew install pandoc');
If the EPUB pandoc build fails, process.exit(1) on line 134 terminates the process immediately without cleaning up the temporary directory created at line 23. The fs.rmSync(TMP, ...) cleanup in the finally block (line 158-159) only covers the docx build.
Consider wrapping both builds in a single try/finally, or adding cleanup before process.exit(1) on line 134.
Suggested change:

    - console.error('\nPandoc EPUB failed. Is pandoc installed? Run: brew install pandoc');
    + console.error('\nPandoc EPUB failed. Is pandoc installed? Run: brew install pandoc');
    + try {
    +   fs.rmSync(TMP, { recursive: true, force: true });
    + } catch (cleanupErr) {
    +   // ignore cleanup errors
    + }
    '--split-level=1',
    '--syntax-highlighting=tango',
--syntax-highlighting=tango (line 122) and --split-level=1 (line 121) do not appear to be valid pandoc command-line options. The standard pandoc equivalents are --highlight-style=tango and --epub-chapter-level=1 respectively. Note that epub-chapter-level: 1 is already set in epub/metadata.yaml, so the command-line flag may be redundant once corrected.
If using a very recent pandoc version that introduced --split-level, please verify by running pandoc --help | grep split-level in the CI environment (Ubuntu apt pandoc). The pandoc version from apt on ubuntu-latest is often older and may not support newer flags.
Suggested change:

    - '--split-level=1',
    - '--syntax-highlighting=tango',
    + '--epub-chapter-level=1',
    + '--highlight-style=tango',
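If the build needs to run against both older and newer pandoc releases, one hedged alternative to hard-coding either spelling is to choose flags from the text of `pandoc --help`, mirroring the `grep` check suggested above. The helper `pickPandocFlags` is hypothetical, and the help excerpts in the demo are made up for illustration; only the four flag names come from this thread.

```javascript
// Hypothetical helper: chooses flag spellings based on what the installed
// pandoc lists in its --help output.
function pickPandocFlags(helpText) {
  const supports = (flag) => helpText.includes(flag);
  return [
    supports('--split-level') ? '--split-level=1' : '--epub-chapter-level=1',
    supports('--syntax-highlighting')
      ? '--syntax-highlighting=tango'
      : '--highlight-style=tango',
  ];
}

// Made-up help excerpts, used only to exercise both branches.
const newerHelp = '--split-level=NUMBER\n--syntax-highlighting=STYLE\n';
const olderHelp = '--epub-chapter-level=NUMBER\n--highlight-style=STYLE\n';

console.log(pickPandocFlags(newerHelp));
console.log(pickPandocFlags(olderHelp));
```

In the real script the help text could come from `execFileSync('pandoc', ['--help'], { encoding: 'utf8' })` in Node's `child_process` module, so the choice follows whatever version the CI runner installs.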
What this does
Adds automated EPUB and Word export from the docs/ folder.
Formats
How it works
- `scripts/build-epub.js` preprocesses all 44 docs (strips broken links, podcast callouts), then runs pandoc
- `.github/workflows/build-epub.yml` triggers automatically whenever `docs/**` changes on main

Triggered by

Changes on `main` to `docs/**`, `epub/metadata.yaml`, `epub/epub.css`, or `scripts/build-epub.js`