Transcribe podcasts and other audio from a URL or local file. Choose between local Whisper, AWS Transcribe, or Google Cloud Speech‑to‑Text. Export transcripts to text, subtitles, and e‑books.
- Backends: `--service whisper|aws|gcp` (pluggable architecture).
- Inputs: local files, direct URLs, YouTube (via `yt-dlp`), and podcast RSS feeds (first enclosure).
- Outputs: `--format txt|pdf|epub|mobi|azw|azw3|srt|vtt|json|md`, plus DOCX via the optional extra `docx`.
- Export details:
  - PDF: headers/footers, optional cover page, auto‑TOC from segments, custom fonts and page size.
  - EPUB/Kindle: built‑in themes or custom CSS, multi‑chapter from segments, optional cover.
  - DOCX: simple manuscript export with optional cover page (install `[docx]`).
  - Subtitles: SRT/VTT with timestamps and optional speaker labels.
  - JSON: full transcript + segments + word‑level timings (when available).
- Advanced transcription:
  - Speaker diarization: `--speakers N` for AWS/GCP.
  - Whisper chunking: `--chunk-seconds N` for long audio; `--translate` for English translation.
  - GCP long‑running recognition: `--gcp-longrunning`.
- Batch processing: `--input-file list.txt` to process many items into a directory.
- Caching and robustness: retry/backoff for downloads, `--cache-dir` and `--no-cache` for transcript caching.
- Post‑processing: `--normalize` (whitespace/paragraphs), `--summarize N` (naive summary).
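The "first enclosure" rule for RSS inputs mentioned above can be sketched with the standard library. `first_enclosure_url` is an illustrative helper, not this package's API:

```python
import xml.etree.ElementTree as ET
from typing import Optional

def first_enclosure_url(rss_xml: str) -> Optional[str]:
    """Return the URL of the first <enclosure> in an RSS 2.0 feed, if any."""
    root = ET.fromstring(rss_xml)
    # RSS 2.0: channel/item/enclosure carries the audio URL in its attributes.
    enclosure = root.find("./channel/item/enclosure")
    return enclosure.get("url") if enclosure is not None else None

feed = """<rss version="2.0"><channel><item>
  <title>Episode 1</title>
  <enclosure url="https://example.com/ep1.mp3" type="audio/mpeg" length="123"/>
</item></channel></rss>"""
print(first_enclosure_url(feed))  # https://example.com/ep1.mp3
```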
- Python 3.9+
- Core dependency: `requests`
- Optional extras (installed only if you use the feature):
  - Whisper: `openai-whisper`, `ffmpeg`
  - AWS: `boto3` + AWS credentials; env var `AWS_TRANSCRIBE_S3_BUCKET`
  - GCP: `google-cloud-speech` + credentials (`GOOGLE_APPLICATION_CREDENTIALS`)
  - PDF: `fpdf2`
  - EPUB/Kindle: `ebooklib` (and Calibre’s `ebook-convert` for Kindle formats)
  - YouTube: `yt-dlp`
  - ID3 cover/title: `mutagen` (optional)
Install from PyPI (core only):

```bash
pip install podcast-transcriber
```

Install with extras (examples):

```bash
# Local Whisper backend (requires ffmpeg on PATH)
pip install "podcast-transcriber[whisper]"

# Export formats (PDF/EPUB/Kindle)
pip install "podcast-transcriber[export]"

# Orchestrator + ingestion + templates
pip install "podcast-transcriber[orchestrator,ingest,templates]"
```

Extras quick reference:
| Feature | Extra | Install command | Notes |
|---|---|---|---|
| Whisper (local) | `whisper` | `pip install -e .[whisper]` | Requires ffmpeg on PATH |
| AWS Transcribe | `aws` | `pip install -e .[aws]` | Needs AWS creds + `AWS_TRANSCRIBE_S3_BUCKET` |
| GCP Speech-to-Text | `gcp` | `pip install -e .[gcp]` | Needs `GOOGLE_APPLICATION_CREDENTIALS` |
| Export formats (PDF/EPUB/Kindle) | `export` | `pip install -e .[export]` | Kindle formats require Calibre `ebook-convert` |
| Developer tools | `dev` | `pip install -e .[dev]` | Includes pytest, etc. |
| Docs | `docs` | `pip install -e .[docs]` | MkDocs + Material |
Install from source (editable) for development:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -e .[dev]
```

Optional extras examples:

```bash
pip install -e .[whisper]
pip install -e .[aws]
pip install -e .[gcp]
pip install -e .[export]
```

- Formatter: Ruff (via `make fmt` / `make fmt-check`).
- Linter: Ruff (via `make lint` / `make lint-fix`).
- Optional: Black config exists for local use, but CI and Make targets use Ruff.
Build a minimal image (choose extras via build-arg). By default the image includes useful runtime extras: `export,templates,ingest,orchestrator,env`. For Whisper (heavy), add `whisper` explicitly.
```bash
# Base features (PDF/EPUB/templates/orchestrator/ingest):
docker build -t podcast-transcriber:latest \
  --build-arg PIP_EXTRAS=export,templates,ingest,orchestrator,env .

# Include Whisper (requires ffmpeg; already installed in the image):
docker build -t podcast-transcriber:whisper \
  --build-arg PIP_EXTRAS=export,templates,ingest,orchestrator,env,whisper .
```

Run the CLI (mount an output directory):

```bash
mkdir -p ./out
docker run --rm \
  -v "$(pwd)/out:/out" \
  podcast-transcriber:latest \
  --url "https://example.com/audio.mp3" \
  --service aws \
  --format txt \
  --output /out/transcript.txt
```

Run the orchestrator (override the entrypoint with `--entrypoint`):

```bash
# config.yml should be in your current directory
docker run --rm \
  --entrypoint podcast-cli \
  -v "$(pwd)/config.yml:/config.yml:ro" \
  -v "$(pwd)/out:/out" \
  -e AWS_TRANSCRIBE_S3_BUCKET="$AWS_TRANSCRIBE_S3_BUCKET" \
  -e GOOGLE_APPLICATION_CREDENTIALS="/secrets/gcp.json" \
  podcast-transcriber:latest \
  run --config /config.yml
```
Unicode PDF note
- Core PDF fonts (e.g., Helvetica) do not support full Unicode. To render non‑ASCII characters, embed a Unicode font via `--pdf-font-file` (CLI) or `pdf_font_file` (YAML outputs).
- Our Docker images install DejaVu fonts. Recommended path: `/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf`.
- Example (CLI): `--pdf-font-file /usr/share/fonts/truetype/dejavu/DejaVuSans.ttf`
## End-to-End Recipes (Oxford)
This repo includes ready-to-run recipes to exercise the full pipeline via Docker. They fetch a Creative Commons podcast RSS feed, transcribe the latest episodes, and produce multiple output formats.
- Recipes:
- `examples/recipes/oxford_quick.yml`: fastest profile for PR/CI (small Whisper model, `clip_minutes: 1`, all outputs).
- `examples/recipes/oxford_cc.yml`: standard profile (balanced quality).
- `examples/recipes/oxford_premium.yml`: highest quality (slowest).
- Run with Docker (Calibre image recommended to enable Kindle formats):
- Build (optional, the script can build for you):
- `docker build -f Dockerfile.calibre -t podcast-transcriber:calibre .`
- Orchestrator E2E (pick a recipe and limit N episodes):
- `./scripts/e2e_docker.sh -c examples/recipes/oxford_quick.yml -n 2 --fresh-state --dockerfile Dockerfile.calibre --image podcast-transcriber:calibre`
- Artifacts end up in `./out/`.
- What the script does:
- Ingests the feed(s), creates a job id, trims to the latest N episodes.
- Processes via orchestrator (`podcast-cli process`) and writes outputs per `outputs:` block in the YAML.
- Uses a local cache `./.e2e-cache -> /root/.cache` to reuse Whisper model downloads.
- `--fresh-state` deletes only the orchestrator state for deterministic runs; it does not clear the Whisper cache.
### Customizing a recipe
- Feeds: under `feeds:` provide one or more entries. You can use any RSS URL, a PodcastIndex id/guid, or a categories filter.
- By RSS URL:
- `feeds: [ { name: MyFeed, url: https://example.com/feed.xml } ]`
- By PodcastIndex (with env creds present):
- `feeds: [ { name: ById, podcastindex_feedid: "12345" } ]`
- Category filter (case-insensitive):
- `categories: ["creative commons", "technology"]`
- Quality presets:
- `quality: quick|standard|premium` (affects Whisper model and some defaults).
- Speed tip: `clip_minutes: 1` pre-clips audio before transcribing for faster runs.
- Outputs: choose formats and per-format options in the `outputs:` array.
- Common formats: `epub, pdf, docx, md, txt, json, srt, vtt, mobi, azw3` (Kindle uses Calibre).
- EPUB:
- `epub_css_text:` or `epub_css_file:` to embed CSS.
- PDF:
- `pdf_font_file:` set a Unicode TTF (e.g., `/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf` in Docker).
- `pdf_cover_fullpage: true` for a full-page cover before the transcript.
- `pdf_first_page_cover_only: true` to start text on a new page after the cover.
- DOCX:
- `docx_cover_first: true` to place cover first.
- `docx_cover_width_inches: 6.0` to control cover width.
- Markdown:
- `md_include_cover: true` to place cover image at the top and save the image alongside the `.md` file.
- Cover & metadata:
- Orchestrator tries to fetch the episode’s `itunes:image` as cover. You can override with `cover_image: /path/to/file.jpg`.
- Common metadata can be set at the top-level (e.g., `author`, `language`), and passed into exports.
### Testing with your own RSS feed
- Duplicate a recipe (e.g., copy `examples/recipes/oxford_cc.yml` to `my_feed.yml`).
- Update:
  - `feeds:` with your own entry, e.g. `- { name: MyFeed, url: https://my/podcast.rss }`.
- Optionally `categories: [...]` to filter entries.
- `quality:` to suit your needs.
- `clip_minutes:` for quicker tests.
- `outputs:` to the list of formats you want to verify.
- Run:
- `./scripts/e2e_docker.sh -c my_feed.yml -n 2 --fresh-state --dockerfile Dockerfile.calibre --image podcast-transcriber:calibre`
### Running without the script
- Direct orchestrator run from Docker (YAML config inside the container):
- `docker run --rm --entrypoint podcast-cli -v "$(pwd)":/workspace -w /workspace podcast-transcriber:calibre ingest --config /workspace/examples/recipes/oxford_cc.yml`
- Then process:
- `docker run --rm --entrypoint podcast-cli -v "$(pwd)":/workspace -w /workspace podcast-transcriber:calibre process --job-id <id>`
- Direct from host (after installing extras):
- `pip install -e .[orchestrator,ingest,templates,export,docx,whisper]`
- `podcast-cli ingest --config examples/recipes/oxford_cc.yml`
- `podcast-cli process --job-id <id> [--clip-minutes N]`
Notes
- Kindle conversion (MOBI/AZW3) requires Calibre’s `ebook-convert`; use `Dockerfile.calibre` image or install Calibre locally.
- KFX is not included in distro Calibre; AZW3 is the recommended modern Kindle format.
- If you hit state “No new episodes discovered”, pass `--fresh-state` to the script (or remove state at `$PODCAST_STATE_DIR`).
Notes

- Provide cloud credentials via environment variables (`AWS_*`, `GOOGLE_APPLICATION_CREDENTIALS`, SMTP vars) or mount secrets files.
- Whisper adds significant image size; only include it if needed.
- Kindle conversions (azw/azw3/kfx) require Calibre’s `ebook-convert`, which is not installed in the image.
Use `compose.yaml` to build and run the image locally.

```bash
# Build (choose extras via PIP_EXTRAS; add ",whisper" if needed)
PIP_EXTRAS=export,templates,ingest,orchestrator,env docker compose build

# Prepare config and output
cp examples/config.example.yml ./config.yml  # or your own config
mkdir -p out secrets

# Optional: put GCP creds in ./secrets/gcp.json and export email/cloud envs
export AWS_TRANSCRIBE_S3_BUCKET=... \
  KINDLE_TO_EMAIL=... \
  KINDLE_FROM_EMAIL=... \
  SMTP_HOST=... SMTP_PORT=587 SMTP_USER=... SMTP_PASS=...

# Run orchestrator pipeline
docker compose up orchestrator

# See output in ./out
```

Compose services:

- `transcriber`: the `podcast-transcriber` CLI (default `--help`).
- `orchestrator`: `podcast-cli run --config /config/config.yml` with volumes mounted for `/config`, `/out`, and `/secrets`.
- Copy the example file and fill in values as needed:

```bash
cp .env.example .env
# edit .env and set SMTP_*, KINDLE_*, and optional PodcastIndex/API keys
```

- The orchestrator automatically loads `.env` if `python-dotenv` is installed (`pip install -e .[env]`). Never commit a real `.env`; the repo ignores `.env` by default.
Run via the Bash wrapper from source (no package install of this project required):

Note: you still need the Python dependencies available in your environment. At minimum, core runs require `requests`. For Whisper/AWS/GCP backends or exports, install the corresponding extras. See the Installation section.

```bash
./Transcribe_podcast_to_text.sh --url "https://example.com/audio.mp3" --service whisper --output out.txt
```

Run via Python module or console entrypoint (requires installing the package and its deps):

```bash
python -m podcast_transcriber --url <URL|path> --service <whisper|aws|gcp> --output out.txt

# after install
podcast-transcriber --url <URL|path> --service <whisper|aws|gcp> --output out.txt
```

High‑level pipeline for “ingest → process → send to Kindle” and weekly digests.
- Install extras: `pip install -e .[orchestrator,ingest,templates]` (and optionally `[scheduler,nlp]`).

Subcommands:

- `podcast-cli ingest --config config.yml` — discover new episodes and create a job.
- `podcast-cli process --job-id <id>` — transcribe and build an EPUB for a job.
  - Ad‑hoc semantic segmentation: add `--semantic` to this command to override YAML.
  - Speed up test runs: add `--clip-minutes N` to limit transcription to the first N minutes (pre-clips audio).
- `podcast-cli send --job-id <id>` — email EPUBs to your Kindle address.
- `podcast-cli run --config config.yml` — run ingest → process → send in one go.
- `podcast-cli digest --feed <name> --weekly` — build a weekly digest EPUB.
Config (YAML) example:

```yaml
feeds:
  - name: myfeed
    url: https://example.com/podcast.rss
  - name: altfeed-by-id
    podcastindex_feedid: "123456"
  - name: altfeed-by-guid
    podcast_guid: "urn:uuid:aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"

service: whisper
quality: standard  # quick|standard|premium
language: sv-SE
author: Your Name
output_dir: ./out
clip_minutes: 1  # optional: clip audio to N minutes before transcribing (faster E2E)

kindle:
  to_email: [email protected]
  from_email: [email protected]
  smtp:
    host: smtp.example.com
    port: 587
    user: smtp-user
    # password set via env only, e.g. SMTP_PASS

# NLP options (optional)
nlp:
  semantic: true   # enable semantic topic segmentation (requires [nlp] extra)
  takeaways: true  # add a simple "Key takeaways" section

# Markdown output (optional)
emit_markdown: true
markdown_template: ./path/to/ebook.md.j2  # omit to use built-in template
```
Templating and themes:
- The built-in template defines blocks you can override: `front_matter`, `title_page`, `preface`, `content`, and `appendix`.
- Create your own Jinja2 theme that starts with `{% extends 'ebook.md.j2' %}` and overrides the blocks you need.
- An example template is provided at `examples/templates/ebook_theme_minimal.md.j2`.
Topics and takeaways in Markdown:
- When NLP is enabled (`nlp.semantic: true` and/or `podcast-cli process --semantic`), the Markdown includes a "Topics" section listing chapter titles derived from segmentation.
- When `nlp.takeaways: true`, the Markdown also includes a "Key Takeaways" section with 3–5 concise bullets. If spaCy is installed, noun chunks are used; otherwise a heuristic is applied.

Secrets policy: store the SMTP password and API keys in environment variables (e.g. `SMTP_PASS`, cloud provider keys). Ensure your Kindle address whitelists your sender.
Scheduling (optional):

- Install: `pip install -e .[scheduler,orchestrator,ingest]`
- Run once: `podcast-auto-run --config config.yml --once`
- Run hourly/daily: `podcast-auto-run --config config.yml --interval hourly|daily`
Topic segmentation (optional):

- Install: `pip install -e .[nlp]`
- The CLI uses a simple fallback if embeddings are unavailable; with embeddings, segments are formed at dips in semantic similarity and “key takeaways” are extracted heuristically.
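The "similarity dips" idea can be illustrated with a toy sketch: compute cosine similarity between consecutive sentence embeddings and cut where it drops. The vectors and threshold below are made up for illustration; real embeddings come from the `[nlp]` extra.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def segment_boundaries(embeddings, threshold=0.5):
    """Indices where similarity between consecutive sentences dips below threshold."""
    return [
        i + 1
        for i in range(len(embeddings) - 1)
        if cosine(embeddings[i], embeddings[i + 1]) < threshold
    ]

# Toy vectors: two "topics" with a clear dip between index 1 and 2.
vecs = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
print(segment_boundaries(vecs))  # [2]
```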
Bilingual EPUB (premium idea):

- Set `bilingual: true` in config to attempt “Original” + “Translated” sections when using Whisper (translation is toggled internally). If translation fails, it falls back to the original only.
Quality presets
- quick: Uses a small Whisper model for fastest runs; ideal for CI smoke tests.
- standard: Default balance of speed/quality; enables simple summarization and 10‑minute chapters.
- premium: Largest Whisper model and richer processing (e.g., optional diarization/topic segmentation) for highest quality.
Usage
- Orchestrator YAML: set `quality: quick|standard|premium`. For fast iterations, also add `clip_minutes: N` to limit transcription length.
- Orchestrator CLI: `podcast-cli process --job-id ... --clip-minutes N` overrides YAML once.
- CI: use `examples/recipes/oxford_quick.yml` (fast); locally, use `examples/recipes/oxford_cc.yml` (standard) or `examples/recipes/oxford_premium.yml`.
Required
- `--url`: URL, local file, YouTube link, or RSS feed.
- `--service`: `whisper`, `aws`, or `gcp`.
Input and batch
- `--input-file list.txt`: process many items (one per line). Requires `--output` to be a directory.
- `--config config.toml`: provide defaults (e.g., `language`, `format`, `title`). If omitted, a config is auto-discovered at `~/.config/podcast-transcriber/config.toml` (or `$XDG_CONFIG_HOME/podcast-transcriber/config.toml`).
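The auto-discovery path resolves roughly as follows. This is a sketch of the lookup rule described above; the package's actual implementation may differ:

```python
import os
from pathlib import Path

def default_config_path() -> Path:
    """$XDG_CONFIG_HOME (or ~/.config) / podcast-transcriber / config.toml"""
    base = os.environ.get("XDG_CONFIG_HOME") or str(Path.home() / ".config")
    return Path(base) / "podcast-transcriber" / "config.toml"

print(default_config_path())
```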
Output and formats
- `--output`: output path (or directory for batch); defaults to stdout for `txt`.
- `--format`: `txt`, `pdf`, `epub`, `mobi`, `azw`, `azw3`, `srt`, `vtt`, `json`, `md`.
- `--title`, `--author`: document metadata.
Interactive mode
- `--interactive`: guided prompts for `--url`, `--service`, `--format`, `--output`, and optional `--language`. Great for first-time users.
Whisper options
- `--whisper-model base|small|medium|large`
- `--chunk-seconds N`: split long audio into chunks.
- `--translate`: Whisper translate task (to English).
- `--language`: hint language code (e.g., `sv`, `en-US`).
- Whisper notes: BCP‑47 tags like `en-US` are normalized to primary codes (e.g., `en`). If a provided code is unsupported by Whisper, the service falls back to auto‑detect.
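The normalization described in the Whisper notes amounts to keeping the primary subtag and falling back to auto‑detect for unknown codes. A sketch; the supported-language set here is illustrative and the package's actual logic may differ:

```python
from typing import Optional

def normalize_whisper_language(
    tag: str,
    supported=frozenset({"en", "sv", "de", "fr"}),  # illustrative subset
) -> Optional[str]:
    """Reduce a BCP-47 tag like 'en-US' to its primary subtag; None => auto-detect."""
    primary = tag.split("-")[0].lower()
    return primary if primary in supported else None

print(normalize_whisper_language("en-US"))  # en
print(normalize_whisper_language("xx-YY"))  # None (falls back to auto-detect)
```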
AWS options
- `--aws-bucket`, `--aws-region`
- `--auto-language` and `--aws-language-options sv-SE,en-US`
- `--speakers N`: enable speaker labels.
- `--aws-keep`: keep the uploaded S3 object after the job completes.
GCP options
- `--gcp-alt-languages en-US,nb-NO`
- `--speakers N`: enable diarization.
- `--gcp-longrunning`: use long-running recognition for long audio.
PDF/EPUB options
- PDF: `--pdf-page-size A4|Letter`, `--pdf-orientation portrait|landscape`, `--pdf-margin <mm>`, `--pdf-font Arial`, `--pdf-font-size 12`, `--pdf-font-file path.ttf`, `--pdf-cover-fullpage`, `--pdf-first-page-cover-only`.
  - Unicode: set `--pdf-font-file` to a Unicode TTF/OTF (e.g., `/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf` in our Docker images) for full character coverage.
  - Cover: `--pdf-cover-fullpage` for a full-page cover; `--pdf-first-page-cover-only` to start the transcript on the next page.
- EPUB/Kindle: `--epub-css-file style.css`, `--epub-theme minimal|reader|classic|dark` or `custom:/path.css`, `--cover-image cover.jpg`, `--auto-toc` (creates a simple TOC from segments; PDF also adds a header/footer based on title/author).
DOCX/Markdown options (via orchestrator outputs)
- DOCX: `docx_cover_first: true` (place cover first), `docx_cover_width_inches: 6.0` (control cover width).
- Markdown: `md_include_cover: true` places the cover at the top and saves the image next to the `.md` file.
Caching and logging
- `--cache-dir /path/to/cache`, `--no-cache`
- `--verbose`, `--quiet`
Post‑processing
- `--normalize`: normalize whitespace/paragraphs.
- `--summarize N`: naive summary (first N sentences).
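The "first N sentences" behavior of `--summarize N` can be sketched as follows (an illustrative implementation, not the package's actual code):

```python
import re

def naive_summary(text: str, n: int) -> str:
    """Return the first n sentences, splitting on ., !, ? followed by whitespace."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:n])

print(naive_summary("First. Second! Third? Fourth.", 2))  # First. Second!
```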
Local Whisper to TXT

```bash
./Transcribe_podcast_to_text.sh \
  --url "https://example.com/podcast.mp3" \
  --service whisper \
  --output transcript.txt
```

AWS with language auto‑detect restricted to Swedish or English (US)

```bash
export AWS_TRANSCRIBE_S3_BUCKET=my-bucket
./Transcribe_podcast_to_text.sh \
  --url "./examples/tone.wav" \
  --service aws \
  --auto-language \
  --aws-language-options sv-SE,en-US \
  --aws-region eu-north-1 \
  --output transcript.txt
```

GCP with alternative languages

```bash
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/creds.json
./Transcribe_podcast_to_text.sh \
  --url "./examples/tone.wav" \
  --service gcp \
  --language sv-SE \
  --gcp-alt-languages en-US,nb-NO \
  --output transcript.txt
```

SRT/VTT with speaker labels (AWS)

```bash
./Transcribe_podcast_to_text.sh \
  --url ./episode.wav \
  --service aws \
  --speakers 2 \
  --format srt \
  --output episode.srt
```

Whisper chunked VTT for a long file

```bash
./Transcribe_podcast_to_text.sh \
  --url ./long.mp3 \
  --service whisper \
  --chunk-seconds 600 \
  --format vtt \
  --output long.vtt
```

EPUB with theme and auto TOC

```bash
./Transcribe_podcast_to_text.sh \
  --url ./episode.mp3 \
  --service whisper \
  --format epub \
  --epub-theme reader \
  --auto-toc \
  --output episode.epub
```

Batch processing to a directory

```bash
cat > list.txt <<EOF
https://example.com/ep1.mp3
https://example.com/ep2.mp3
EOF
./Transcribe_podcast_to_text.sh \
  --service whisper \
  --input-file list.txt \
  --format md \
  --output ./out_dir
```

KDP pipeline (EPUB) for a single episode

```bash
./Transcribe_podcast_to_text.sh \
  --url ./episode.mp3 \
  --service whisper \
  --kdp \
  --title "Podcast: Season 1 – Episode 1" \
  --author "Your Name" \
  --description "A transcribed version of the episode..." \
  --keywords "podcast, swedish, technology" \
  --cover-image ./cover.jpg \
  --output ./episode.epub
```

KDP book from multiple episodes (combine into one EPUB)

```bash
cat > episodes.txt <<EOF
https://example.com/ep1.mp3
https://example.com/ep2.mp3
EOF
./Transcribe_podcast_to_text.sh \
  --service whisper \
  --input-file episodes.txt \
  --combine-into ./podcast-book.epub \
  --kdp \
  --title "My Podcast – Volume 1" \
  --author "Your Name" \
  --description "Transcriptions of the best episodes of the season" \
  --keywords "podcast, swedish, society"
```

DOCX manuscript (requires extra)

```bash
pip install -e .[docx]
./Transcribe_podcast_to_text.sh \
  --url ./episode.mp3 \
  --service whisper \
  --format docx \
  --title "Episode 1" \
  --author "Your Name" \
  --output ./episode.docx
```

- Kindle formats (`mobi|azw|azw3|kfx`) require Calibre’s `ebook-convert` on PATH.
- YouTube extraction requires `yt-dlp`; otherwise an HTTP fallback is used.
- ID3 metadata (title/cover) is read when `mutagen` is installed; RSS feeds use the first `<enclosure>` URL.
- AWS/GCP calls are not made during tests; unit tests mock external services.
- Example plugin: `examples/plugin_echo/` registers an `echo` service via entry points. Install with `pip install -e examples/plugin_echo` and use `--service echo`.
- Smoke test script: `scripts/smoke.sh` automates a basic run including plugin discovery and JSON export. Make it executable and run:

```bash
chmod +x scripts/smoke.sh
./scripts/smoke.sh
```
When using `--format json`, the file includes additional metadata when available from the downloader (ID3, yt-dlp, etc.).

- Keys:
  - `title`: document title.
  - `author`: document author (if provided).
  - `text`: full transcript.
  - `segments`: list of coalesced segments with `start`, `end`, `text`, and optional `speaker`.
  - `words`: optional word-level timings when the backend provides them.
  - `source`: optional object with downloader metadata, for example:
    - `source_url`: original URL.
    - `local_path`: local file path used for transcription.
    - `id3_title`, `id3_artist`: from ID3 tags if present.
    - `source_title`: from yt-dlp (e.g., video title).
    - `source_uploader`: from yt-dlp (e.g., channel/uploader).
    - `cover_url`: thumbnail URL when available.
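A transcript JSON with these keys can be consumed like this. The sample document below is invented, shaped after the key list above:

```python
import json

sample = json.loads("""{
  "title": "Episode 1",
  "text": "Hello world.",
  "segments": [
    {"start": 0.0, "end": 2.5, "text": "Hello world.", "speaker": "spk_0"}
  ],
  "source": {"source_url": "https://example.com/ep1.mp3"}
}""")

# Print each segment with its time range and speaker label (if any).
for seg in sample["segments"]:
    label = seg.get("speaker", "unknown")
    print(f"[{seg['start']:.1f}-{seg['end']:.1f}] {label}: {seg['text']}")
# [0.0-2.5] spk_0: Hello world.
```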
You can ship third-party services as plugins via Python entry points. Register the entry point group `podcast_transcriber.services` in your package and expose either a subclass of `TranscriptionService` or a zero-argument factory that returns one.

`pyproject.toml` (in your plugin):

```toml
[project.entry-points."podcast_transcriber.services"]
myservice = "my_package.my_module:MyService"
```

Your service must implement the `TranscriptionService` interface (see `src/podcast_transcriber/services/base.py`). Once installed, it appears in `--service` choices and in `--interactive` selection.
- Troubleshooting: see `docs/troubleshooting.md` for common issues and fixes.
- ffmpeg (Whisper): install via Homebrew (`brew install ffmpeg`) or apt (`sudo apt-get install -y ffmpeg`).
- ebook-convert (Kindle): install Calibre and ensure `ebook-convert` is on PATH (macOS: `brew install --cask calibre`).
- yt-dlp (YouTube): `pipx install yt-dlp` or `pip install yt-dlp` and ensure it’s on PATH.
- mutagen (ID3 title/cover): `pip install mutagen` to auto‑read MP3 metadata.
- Credentials (AWS/GCP): `pip install boto3` and set `AWS_TRANSCRIBE_S3_BUCKET`; `pip install google-cloud-speech` and set `GOOGLE_APPLICATION_CREDENTIALS`.
- Run tests with `pytest` (external calls are mocked): `pytest -q`
- Layout:
  - `src/podcast_transcriber/` – core logic and services
  - `tests/` – unit tests with mocks
  - `docs/` – MkDocs documentation
  - `examples/` – `generate_tone.py` creates a tiny WAV demo
- GitHub Actions runs `pytest` on push/PR (matrix across Python versions and optional extras).
- MkDocs builds and publishes docs to GitHub Pages (see `.github/workflows/docs.yml`).
- Test coverage: CI currently green at 85% (local ~87%). Generate locally with `make coverage` (XML + terminal) or `make coverage-html` (HTML in `htmlcov/`). CI enforces a minimum via `--cov-fail-under`.
Developed by Johan Caripson.
MIT (see LICENSE)
Utility
- `--credits`: print maintainer credits and exit.