
Add EPUB and Word (.docx) export pipeline #172

Merged
taylorarndt merged 3 commits into main from epub-typesetting
Mar 8, 2026
@taylorarndt
Member

What this does

Adds automated EPUB and Word export from the docs/ folder.

Formats

  • EPUB — for Kindle, Apple Books, Kobo, any e-reader
  • Word (.docx) — for screen reader users on Windows, printing, or editing

How it works

  • scripts/build-epub.js preprocesses all 44 docs (strips broken links, podcast callouts) then runs pandoc
  • .github/workflows/build-epub.yml triggers automatically whenever docs/** changes on main
  • Both files are committed back to the repo and uploaded as 90-day artifacts
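As a rough illustration of the pandoc half of that pipeline, the sketch below assembles the arguments for both formats from one preprocessed Markdown file. The helper name `buildPandocArgs`, the input path, and the exact flag set are assumptions made for illustration and are not the contents of scripts/build-epub.js; only the file paths under epub/ come from this PR.

```javascript
// Hypothetical helper, not taken from scripts/build-epub.js.
// Returns the argument list for one pandoc run; a caller would pass it to
// child_process.execFileSync('pandoc', args).
function buildPandocArgs(format, inputFile) {
  const args = [
    inputFile,
    `--to=${format}`,
    '--metadata-file=epub/metadata.yaml',
    '--toc',
    `--output=epub/git-going-with-github.${format}`,
  ];
  if (format === 'epub') {
    // The stylesheet only applies to the EPUB output.
    args.push('--css=epub/epub.css');
  }
  return args;
}

const input = 'tmp/combined.md'; // assumed name for the preprocessed file
for (const format of ['epub', 'docx']) {
  console.log(['pandoc', ...buildPandocArgs(format, input)].join(' '));
}
```

Building both argument lists from the same input file mirrors the "both formats built from same preprocessed markdown" design; printing the command instead of running it keeps the sketch usable on machines without pandoc.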

Triggered by

  • Any push to main that changes docs/**, epub/metadata.yaml, epub/epub.css, or scripts/build-epub.js
  • Manual via workflow_dispatch

taylorarndt and others added 3 commits March 7, 2026 23:02
- scripts/build-epub.js — builds a single EPUB from all 44 docs/ files
- epub/metadata.yaml — title, author, language, TOC depth, CC BY 4.0 rights
- epub/epub.css — clean typesetting (serif body, sans headings, code blocks, tables)
- .github/workflows/build-epub.yml — auto-builds on docs/ changes, uploads as artifact
- npm run build:epub script added to package.json

EPUB is generated locally but not committed (in .gitignore).
The workflow artifact is retained for 90 days.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove WeasyPrint/PDF (dependency issues)
- Add Word .docx output using pandoc --to docx
- Both formats built from same preprocessed markdown
- CI workflow updated: uploads EPUB and Word artifacts (90-day)
- .gitignore updated for .docx
- pdf.css kept for reference but not used in build

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 8, 2026 05:25
@taylorarndt taylorarndt requested a review from accesswatch as a code owner March 8, 2026 05:25
@github-actions
Contributor

github-actions bot commented Mar 8, 2026

Peer Review Assigned

Hi @taylorarndt! Your PR has been automatically paired with @accesswatch for peer review.

For @accesswatch:

This is a great opportunity to practice code review skills! Here's what to look for:

Content Quality:

  • Does the change accomplish what the issue describes?
  • Is the writing clear and helpful?
  • Are there any typos or grammar issues?

Accessibility:

  • Proper heading hierarchy (H1 → H2 → H3, no skips)?
  • Descriptive link text (not "click here")?
  • Alt text on images?
  • [TODO] markers removed?

Documentation:

  • Code blocks are properly formatted?
  • Tables have headers?
  • References/links work correctly?

Review Guidelines:

  • Be kind and constructive
  • Suggest improvements, don't just point out problems
  • Ask questions if something is unclear
  • Approve when ready or request changes with explanation

Resources:


Pairing by Learning Room Grouping Engine

@taylorarndt taylorarndt closed this Mar 8, 2026
@taylorarndt taylorarndt merged commit 2126847 into main Mar 8, 2026
4 of 5 checks passed
@taylorarndt taylorarndt deleted the epub-typesetting branch March 8, 2026 05:28

Copilot AI left a comment

Pull request overview

This PR adds an automated EPUB and Word (.docx) export pipeline for the docs/ folder, enabling accessible reading on e-readers, screen readers, and for printing. The build script preprocesses 44+ markdown files (stripping podcast callouts, rewriting internal links) and runs pandoc to generate both formats. A GitHub Actions workflow automates the build on docs changes.

Changes:

  • New scripts/build-epub.js build script that preprocesses markdown files and invokes pandoc for EPUB and Word output
  • New .github/workflows/build-epub.yml workflow that triggers on docs changes, builds both formats, uploads artifacts, and commits outputs
  • New epub/ directory with metadata, EPUB CSS, and PDF CSS files; updated .gitignore and package.json

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| scripts/build-epub.js | Core build script: reads docs, preprocesses markdown, invokes pandoc for EPUB and DOCX |
| .github/workflows/build-epub.yml | CI workflow: triggers on docs changes, installs pandoc, runs build, uploads artifacts, commits outputs |
| epub/metadata.yaml | Pandoc metadata for the EPUB/DOCX (title, author, TOC settings) |
| epub/epub.css | Stylesheet for the EPUB output |
| epub/pdf.css | Stylesheet for future PDF output (not yet used by build script) |
| .gitignore | Ignores generated .epub and .docx output files |
| package.json | Adds build:epub npm script |


Comment on lines +69 to +70
// Pattern: > **Listen to Episode N:** ... line(s)
content = content.replace(/^>[ \t]*\*\*Listen to Episode[^\n]*\n/gm, '');

Copilot AI Mar 8, 2026

The podcast callout regex only removes the > **Listen to Episode... line, but the actual callouts in the docs are preceded by a standalone > line (e.g., docs/00-pre-workshop-setup.md line 2). This leaves an orphaned empty blockquote marker > in the preprocessed output, which will render as an empty blockquote element in the EPUB/DOCX.

Consider expanding the pattern to also match and remove the preceding empty blockquote line, e.g. matching ^>\s*\n^>[ \t]*\*\*Listen to Episode[^\n]*\n.

Suggested change
- // Pattern: > **Listen to Episode N:** ... line(s)
- content = content.replace(/^>[ \t]*\*\*Listen to Episode[^\n]*\n/gm, '');
+ // Pattern:
+ // >
+ // > **Listen to Episode N:** ... line
+ content = content.replace(/^>\s*\n^>[ \t]*\*\*Listen to Episode[^\n]*\n/gm, '');
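A small standalone sketch can help check the pattern against sample input before it goes into the build script. The version below is a variant of the suggestion above, not the merged code: it uses `[ \t]*` instead of `\s*` so the empty `>` line cannot match across blank lines, and it makes that line optional so callouts without it are still removed. The function name `stripPodcastCallouts` and the sample text are assumptions for illustration.

```javascript
// Hypothetical helper name; the regex is a variant of the suggested pattern.
// The optional leading group removes a bare ">" line only when it directly
// precedes the "Listen to Episode" line.
const CALLOUT = /^(?:>[ \t]*\n)?>[ \t]*\*\*Listen to Episode[^\n]*\n/gm;

function stripPodcastCallouts(content) {
  return content.replace(CALLOUT, '');
}

// Sample input shaped like the callout described in the review comment.
const sample = '# Setup\n>\n> **Listen to Episode 1:** Intro\n\nBody text\n';
console.log(JSON.stringify(stripPodcastCallouts(sample)));
```

Ordinary blockquotes are left alone because the pattern still requires the literal `**Listen to Episode` text on the second line.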

#!/usr/bin/env node
/**
* Build EPUB from docs/ markdown files using Pandoc.
* Outputs: epub/git-going-with-github.epub

Copilot AI Mar 8, 2026

The file-level doc comment says "Outputs: epub/git-going-with-github.epub" but the script also outputs epub/git-going-with-github.docx. The docstring should mention both outputs.

Suggested change
-  * Outputs: epub/git-going-with-github.epub
+  * Outputs: epub/git-going-with-github.epub, epub/git-going-with-github.docx

  const size = (fs.statSync(EPUB_OUT).size / 1024).toFixed(1);
  console.log(`\nEPUB written: epub/git-going-with-github.epub (${size} KB)`);
} catch (err) {
  console.error('\nPandoc EPUB failed. Is pandoc installed? Run: brew install pandoc');

Copilot AI Mar 8, 2026

The error message suggests brew install pandoc as the install command, but the CI workflow runs on ubuntu-latest where apt-get install pandoc is the correct command. This could be confusing for contributors who encounter the error in CI. Consider making the suggestion platform-aware, or at minimum mentioning both brew install pandoc (macOS) and sudo apt-get install pandoc (Linux).
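One way to make the hint platform-aware is sketched below. The helper name `pandocInstallHint` is hypothetical, the Linux command assumes a Debian/Ubuntu system such as the CI runner, and every other platform is simply pointed at pandoc's installation page rather than guessing a package manager.

```javascript
// Hypothetical helper: maps a Node platform string to an install hint.
function pandocInstallHint(platform = process.platform) {
  switch (platform) {
    case 'darwin':
      return 'brew install pandoc';
    case 'linux':
      // Assumes Debian/Ubuntu (as on ubuntu-latest); other distros differ.
      return 'sudo apt-get install pandoc';
    default:
      return 'see https://pandoc.org/installing.html';
  }
}

console.error(
  `Pandoc EPUB failed. Is pandoc installed? Try: ${pandocInstallHint()}`
);
```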

Comment on lines +51 to +61
- name: Commit outputs to repository
  run: |
    git config user.name "github-actions[bot]"
    git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
    git add epub/git-going-with-github.epub epub/git-going-with-github.docx
    if [ -n "$(git status --porcelain)" ]; then
      git commit -m "chore: rebuild EPUB and Word from latest docs"
      git push
    else
      echo "No changes to commit"
    fi

Copilot AI Mar 8, 2026

The .gitignore file (lines 168-169) ignores epub/git-going-with-github.epub and epub/git-going-with-github.docx, but the workflow step at line 55 runs git add on those same files. Since they are gitignored and not yet tracked, git add will refuse to stage them: it reports that the paths are ignored, suggests -f, and exits with a non-zero status, which fails the step under the runner's default bash -e shell. Either way nothing is staged, and the commit/push will never happen.

You need to either:

  1. Remove these entries from .gitignore so the files can be committed, or
  2. Use git add -f epub/git-going-with-github.epub epub/git-going-with-github.docx to force-add ignored files, or
  3. Remove the commit step entirely and rely solely on the artifact uploads.

  const size = (fs.statSync(EPUB_OUT).size / 1024).toFixed(1);
  console.log(`\nEPUB written: epub/git-going-with-github.epub (${size} KB)`);
} catch (err) {
  console.error('\nPandoc EPUB failed. Is pandoc installed? Run: brew install pandoc');

Copilot AI Mar 8, 2026

If the EPUB pandoc build fails, process.exit(1) on line 134 terminates the process immediately without cleaning up the temporary directory created at line 23. The fs.rmSync(TMP, ...) cleanup in the finally block (lines 158-159) only covers the docx build.

Consider wrapping both builds in a single try/finally, or adding cleanup before process.exit(1) on line 134.

Suggested change
- console.error('\nPandoc EPUB failed. Is pandoc installed? Run: brew install pandoc');
+ console.error('\nPandoc EPUB failed. Is pandoc installed? Run: brew install pandoc');
+ try {
+   fs.rmSync(TMP, { recursive: true, force: true });
+ } catch (cleanupErr) {
+   // ignore cleanup errors
+ }
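The other option mentioned above, a single try/finally around both builds, could look roughly like the sketch below. The function name `runBuilds`, the temp directory prefix, and the step functions are all hypothetical; the point is only that cleanup lives in one finally block and failure is reported through a return value instead of calling process.exit() inside the try, which would skip the finally.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: runs each build step against a shared temp directory
// and always removes that directory, even when a step throws.
function runBuilds(steps) {
  const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'epub-build-'));
  let ok = true;
  try {
    for (const step of steps) {
      step(tmp); // e.g. a function that shells out to pandoc
    }
  } catch (err) {
    ok = false;
    console.error(`Build failed: ${err.message}`);
  } finally {
    // Runs on success and on failure, so no step needs its own cleanup.
    fs.rmSync(tmp, { recursive: true, force: true });
  }
  return { ok, tmp };
}

// Usage: the second step fails, yet the temp directory is still removed.
const result = runBuilds([
  (dir) => fs.writeFileSync(path.join(dir, 'combined.md'), '# Demo\n'),
  () => { throw new Error('simulated pandoc failure'); },
]);
console.log(result.ok, fs.existsSync(result.tmp));
```

A caller can then set `process.exitCode = 1` when `ok` is false, which lets Node exit with a failing status after cleanup has already happened.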

Comment on lines +121 to +122
'--split-level=1',
'--syntax-highlighting=tango',

Copilot AI Mar 8, 2026

--syntax-highlighting=tango (line 122) and --split-level=1 (line 121) are not recognized by older pandoc releases. --split-level was added in pandoc 3.0 as the replacement for --epub-chapter-level, and --syntax-highlighting is a newer replacement for --highlight-style. The equivalents that older versions accept are --highlight-style=tango and --epub-chapter-level=1 respectively. Note that epub-chapter-level: 1 is already set in epub/metadata.yaml, so the command-line flag may be redundant once corrected.

The pandoc version from apt on ubuntu-latest is often older and may not support newer flags, so please verify by running pandoc --help | grep split-level in the CI environment (Ubuntu apt pandoc).

Suggested change
- '--split-level=1',
- '--syntax-highlighting=tango',
+ '--epub-chapter-level=1',
+ '--highlight-style=tango',
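If the script needs to run against both older and newer pandoc builds, the flag could be chosen from the reported version instead of being hard-coded. The sketch below is an assumption-laden illustration: `chapterSplitFlag` is a hypothetical name, the input is the text printed by `pandoc --version`, and the cutoff assumes that --split-level arrived in pandoc 3.0.

```javascript
// Hypothetical helper: picks the chapter-splitting flag from the first line
// of `pandoc --version` output, e.g. "pandoc 3.1.3".
// Assumption: --split-level is available from pandoc 3.0 onward.
function chapterSplitFlag(versionOutput, level = 1) {
  const match = /^pandoc(?:\.exe)?\s+(\d+)\./m.exec(versionOutput);
  if (!match) {
    throw new Error('Could not parse pandoc version');
  }
  const major = Number(match[1]);
  return major >= 3
    ? `--split-level=${level}`
    : `--epub-chapter-level=${level}`;
}

// Example version strings (illustrative, not captured from the CI runner).
console.log(chapterSplitFlag('pandoc 3.1.3\nFeatures: +server +lua'));
console.log(chapterSplitFlag('pandoc 2.9.2.1\nCompiled with pandoc-types 1.20'));
```

In the build script, the version text could come from `child_process.execFileSync('pandoc', ['--version'], { encoding: 'utf8' })`.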
