Podcast Audio Pipeline
This directory contains the complete pipeline for producing the Git Going with GitHub audio series: 54 companion episodes of two-host conversational content designed for blind and low-vision developers, plus 21 Challenge Coach episodes placed near the chapters they support.
Pipeline Overview
build-bundles.js Generate source bundles from chapter content
|
v
bundles/*.md One bundle per episode (production prompt + source material)
|
v
scripts/*.txt Conversational scripts with [ALEX]/[JAMIE]/[PAUSE] markers
|
v
transcripts/*-chapters.json Ordered chapter plans by segment index
|
v
tts/ Local neural TTS (ONNX models, pronunciation lexicon)
|
v
audio/*.(wav|mp3) Final episode files (gitignored - hosted on GitHub Releases)
|
v
generate-site.js Build PODCASTS.md player page and RSS feed from manifest
Audio is generated locally using ONNX text-to-speech models: no cloud APIs, no API keys, and no billing. Everything runs on your machine.
Use the unified audio command as the front door. It defaults to Kokoro, the production audio path. Piper remains available only when explicitly selected for fallback or comparison. See Piper Fallback TTS for setup, scope, and validation guidance.
python -m podcasts.tts.generate_audio --audio-format mp3
python -m podcasts.tts.generate_audio --engine kokoro --audio-format mp3
python -m podcasts.tts.generate_audio --engine piper --audio-format mp3
podcasts/config/listening-order.json controls the public listening path. It interleaves companion lessons, Challenge Coach episodes, and reference episodes so podcast apps and the generated player page present the workshop as one end-to-end experience. The audio generators and inventory tools use the same order, so generation queues match the learner path.
Final episode output format is configurable through podcasts/tts/voice-config.ini (episode_audio_format = wav|mp3|both). MP3 generation requires ffmpeg on your PATH.
After all MP3 files and segment manifests are generated, run the metadata pass to add ID3 tags, embed the source script, and derive smart chapter markers for each episode. The metadata pass defaults to a dry run so it can verify the full 75-file set before touching audio.
Directory Structure
The following tree separates source files, generated artifacts, local caches, and legacy helpers.
podcasts/
README.md This guide
REGENERATION.md Full regeneration runbook
config/listening-order.json Canonical public listening path
lib/listening-plan.js Shared JavaScript listening-order resolver
listening_plan.py Shared Python listening-order resolver
build-bundles.js Companion episode catalog and bundle generator
build-challenge-bundles.js Challenge Coach catalog and bundle generator
generate-draft-transcripts.js Reviewable Alex/Jamie script generator
generate-site.js Generates https://lp.csedesigns.com/ggg/PODCASTS.html and feed.xml
validate-catalog.js Validates source coverage
validate-listening-order.js Validates the complete public listening path
validate-feed.js Validates RSS structure and enclosures
verify_audio_inventory.py Validates scripts, transcripts, manifests, and MP3s
tag-audio-metadata.py Writes ID3 metadata and chapter markers
manifest.json Companion episode metadata
feed.xml Generated RSS feed
bundles/ Generated companion prompt packets
challenge-bundles/ Generated Challenge Coach prompt packets
scripts/ Committed transcript source scripts
transcripts/ Derived segment JSON transcripts and chapter plans
chapters/ Podcasting 2.0 chapter JSON sidecars written during metadata tagging
audio/ Local audio output and segment cache, not committed
logs/ Local generation and inventory reports
tools/agentic-pilot/ One-episode packet builder and transcript evaluation helpers
tools/legacy/ Older diagnostic and one-off helpers
tts/ Production and fallback TTS package
Prerequisites
- Python 3.10 or later
- Kokoro TTS: pip install kokoro-onnx soundfile numpy
- FFmpeg on your PATH for MP3 conversion
- Mutagen for MP3 ID3 metadata: pip install mutagen
- Node.js 18 or later (for bundle/site generation only)
For full regeneration guidance after curriculum changes, see Podcast Regeneration Runbook.
Quick Start
1. Install Kokoro and metadata tooling
pip install kokoro-onnx soundfile numpy
pip install mutagen
2. Download voice models (if not already present)
python -m podcasts.tts.download_kokoro_samples --english-high-quality-only
This downloads the Kokoro ONNX model and voices file to podcasts/tts/models/.
3. Validate the catalog and generate fresh local bundles
npm run validate:podcasts
npm run build:podcast-bundles
npm run build:podcast-challenge-bundles
npm run generate:podcast-transcripts
podcasts/bundles/*.md files are generated prompt packets. They are intentionally ignored by git and should be regenerated when needed.
podcasts/challenge-bundles/*.md files are the same kind of generated prompt packet, but scoped to individual Challenge Coach episodes.
Transcript generation now writes three artifacts for each episode or challenge:
- the transcript source in podcasts/scripts/
- the segment manifest in podcasts/transcripts/*-segments.json
- the ordered chapter plan in podcasts/transcripts/*-chapters.json
The chapter plan is sequential, not time-based. It stores chapter titles with segment indexes. Later, the metadata pass converts those ordered boundaries into timed ID3 chapter markers and Podcasting 2.0 chapter sidecars after audio generation.
You can also regenerate a subset instead of rebuilding all 75 scripts:
npm run generate:podcast-transcript -- --slug ep05-working-with-issues
npm run generate:podcast-transcript -- --start 1 --end 4 --group challenges
npm run generate:podcast-transcript -- --start 20 --end 25 --group appendices
4. Generate all episodes
Preview the listening-order generation queue without loading models or creating audio:
npm run podcast:audio:queue
Then generate audio when you are ready:
python -m podcasts.tts.generate_audio --audio-format mp3
This batch command processes the full committed script set: all ep*.txt companion episodes plus all cc-*.txt Challenge Coach and bonus episodes.
Or generate a single episode:
python -m podcasts.tts.generate_audio --start 0 --end 0 --force --audio-format mp3
python -m podcasts.tts.generate_audio --start 5 --end 5 --force --audio-format mp3
Or a range:
python -m podcasts.tts.generate_audio --start 0 --end 10 --audio-format mp3
5. Build player page and RSS feed
npm run build:podcast-site
Voice Configuration
The default voices are:
| Host | Kokoro Voice | Character | Description |
|---|---|---|---|
| Alex | am_liam | Lead host, experienced, warm | Male, polished delivery with stronger presence |
| Jamie | af_jessica | Co-host, curious, energetic | Female, clear and natural delivery |
Listen to the samples in podcasts/tts/samples/ to compare voices before changing the defaults.
To change Kokoro voices, pass --male-voice and --female-voice through the unified command: python -m podcasts.tts.generate_audio --engine kokoro --male-voice am_liam --female-voice af_jessica.
If delivery feels too slow or too fast, tune Kokoro pacing directly:
python -m podcasts.tts.generate_audio --engine kokoro --speech-speed 1.08 --pause-seconds 1.0 --inter-segment-seconds 0.18 --inter-speaker-seconds 0.28
Pitch can be configured independently per host in podcasts/tts/voice-config.ini:
male_pitch_semitones = -1.0
female_pitch_semitones = 0.8
For backwards compatibility, pitch_semitones is still accepted and applies the same shift to both voices.
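The pitch fallback described above can be sketched as follows. This is a minimal illustration, not the project's actual loader: the section name `voices`, the function name `resolve_pitch`, and the rule that a per-host key overrides the shared `pitch_semitones` key are all assumptions made for the example.

```python
import configparser


def resolve_pitch(config_text: str, section: str = "voices") -> tuple[float, float]:
    """Return (male_shift, female_shift) in semitones.

    Assumption: per-host keys win, and the legacy pitch_semitones key
    is used for any host that has no key of its own.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # The section name is hypothetical; an absent section means no shift.
    values = parser[section] if parser.has_section(section) else {}
    shared = float(values.get("pitch_semitones", 0.0))
    male = float(values.get("male_pitch_semitones", shared))
    female = float(values.get("female_pitch_semitones", shared))
    return male, female


PER_HOST = "[voices]\nmale_pitch_semitones = -1.0\nfemale_pitch_semitones = 0.8\n"
LEGACY = "[voices]\npitch_semitones = 0.5\n"

print(resolve_pitch(PER_HOST))
print(resolve_pitch(LEGACY))
```

With the per-host keys present, each voice gets its own shift. With only the legacy key, both voices receive the same value.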
Pronunciation Lexicon
The file podcasts/tts/lexicon.txt contains pronunciation overrides for technical terms, acronyms, and jargon. The lexicon is applied as text substitution before Kokoro synthesizes each segment.
Format: one entry per line, tab-separated WORD<tab>REPLACEMENT. Lines starting with # are comments.
Example entries:
WCAG W-Cag
NVDA N V D A
GitHub Git Hub
JSON Jason
Add new entries when Kokoro mispronounces a word. The lexicon is loaded once per run and uses word-boundary matching so entries like GUI do not affect words like "guidelines".
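To make the lexicon behavior concrete, here is a small sketch of tab-separated loading and word-boundary substitution. It is an illustration under stated assumptions rather than the code in podcasts/tts: the function names are hypothetical, matching is assumed to be case-sensitive, and the GUI replacement text is invented for the example.

```python
import re


def load_lexicon(text: str) -> dict[str, str]:
    """Parse WORD<tab>REPLACEMENT lines, skipping blanks and # comments."""
    entries = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        word, _, replacement = line.partition("\t")
        if replacement:
            entries[word.strip()] = replacement.strip()
    return entries


def apply_lexicon(segment: str, entries: dict[str, str]) -> str:
    """Replace whole-word matches only, in a single pass over the segment."""
    if not entries:
        return segment
    # Longest terms first so a longer entry wins over a shorter prefix.
    words = sorted(entries, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(w) for w in words) + r")\b")
    return pattern.sub(lambda match: entries[match.group(1)], segment)


# The GUI entry is a made-up replacement used only for this demonstration.
SAMPLE = "# acronyms\nWCAG\tW-Cag\nJSON\tJason\nGUI\tgooey\n"
lexicon = load_lexicon(SAMPLE)
print(apply_lexicon("GUIDs and the GUI use JSON.", lexicon))
```

The `\b` anchors are what keep an entry such as GUI from rewriting a longer word that merely begins with the same letters.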
Manifest Status Flow
Each episode in manifest.json progresses through these statuses:
bundle-ready --> script-ready --> audio-ready --> published
(build-bundles) (scripts/) (tts/) (GitHub Release)
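The status flow is strictly linear, which a short sketch can make explicit. The helper names below are hypothetical and this is not code from the repository; only the four status strings come from the flow above.

```python
STATUS_ORDER = ["bundle-ready", "script-ready", "audio-ready", "published"]


def next_status(current: str):
    """Return the status that follows `current`, or None once published."""
    position = STATUS_ORDER.index(current)  # raises ValueError on unknown status
    if position == len(STATUS_ORDER) - 1:
        return None
    return STATUS_ORDER[position + 1]


def can_advance(current: str, target: str) -> bool:
    """Allow only single-step forward moves, so no stage is skipped."""
    return next_status(current) == target


print(next_status("script-ready"))
print(can_advance("bundle-ready", "published"))
```

Restricting moves to one step at a time reflects the idea that an episode should not be marked published before its script and audio exist.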
All npm Scripts
The following table lists the supported podcast build and validation commands.
| Command | What It Does |
|---|---|
| npm run validate:podcasts | Validate episode catalog source mappings and the complete listening order |
| npm run build:podcast-bundles | Generate source bundles from chapters |
| npm run build:podcast-challenge-bundles | Generate source bundles for Challenge Coach episodes |
| npm run generate:podcast-transcripts | Replace old scripts with fresh reviewable Alex/Jamie draft transcripts |
| npm run generate:podcast-transcript -- --slug <slug> | Regenerate one selected transcript or a filtered range using --start, --end, and --group |
| npm run podcast:agentic:packet -- --slug <slug> | Build a single episode source packet for rewrite and review workflows |
| npm run podcast:agentic:stage -- --slug <slug> --input <file> | Save a retryable candidate rewrite under logs/agentic-pilots/candidates/<slug>/attempt-###.txt |
| npm run podcast:agentic:catalog | Run full-catalog coverage/style/repetition evaluation with per-episode reports |
| npm run podcast:agentic:promote -- --slug <slug> | Promote an accepted candidate transcript into the live script path only when all gates pass |
| npm run podcast:chapters:normalize | Normalize generated chapter-plan sidecars to remove weak or overly generic titles |
| npm run podcast:chapters:audit | Audit all generated chapter-plan sidecars and report title quality across the full catalog |
| npm run build:podcast-transcripts | Run validation, regenerate bundles, regenerate transcripts, and rebuild podcast page/feed |
| npm run build:podcast-audio | Generate MP3 audio for all companion, Challenge Coach, and bonus scripts with local Kokoro TTS |
| npm run build:podcast-audio:piper | Generate audio with the legacy local Piper TTS path |
| npm run build:podcast-audio:kokoro | Generate MP3 audio with the Kokoro TTS path |
| npm run podcast:audio:queue | Print the listening-order audio generation queue without creating MP3 files |
| npm run build:podcast-transcripts-and-audio | Run full transcript pipeline, generate audio, and rebuild podcast page/feed |
| npm run build:podcast-site | Build player page and RSS feed |
| npm run podcast:metadata:check | Dry-run validation that all 75 MP3s and matching scripts are present before tagging |
| npm run podcast:metadata:write | Write ID3 metadata, embed episode scripts and smart chapters, write chapter JSON sidecars, and touch all 75 MP3 files |
| npm run build:podcasts | Bundles + site |
| npm run build | Full build: podcasts + HTML site |
Publishing Audio
Audio files are gitignored and hosted on GitHub Releases, not in the repository.
- Generate all audio as MP3 files: npm run build:podcast-audio
- Confirm all expected MP3 files exist: npm run podcast:metadata:check
- Write ID3 tags, embed the source script, derive smart chapter markers, and touch each MP3: npm run podcast:metadata:write
- Build the podcast page and RSS feed: npm run build:podcast-site
- Create a GitHub Release tagged podcasts
- Upload the MP3 files from podcasts/audio/ or the selected voice output folder as release assets
- The RSS feed points to release asset URLs, links chapter JSON sidecars, and embeds clean script text in each item
- Update manifest status to published and rebuild the site
Updating Episodes
When chapter content changes:
- npm run validate:podcasts to catch missing source mappings and coverage gaps
- npm run build:podcast-bundles to regenerate local chapter and appendix bundles
- npm run build:podcast-challenge-bundles to regenerate local challenge bundles
- npm run generate:podcast-transcripts to replace old scripts with fresh reviewable drafts
- Review and edit the scripts in podcasts/scripts/
- python -m podcasts.tts.generate_all_kokoro --start <number> --end <number> --force --audio-format mp3 to regenerate audio
- npm run podcast:metadata:check to verify the complete MP3 set before tagging
- npm run podcast:metadata:write to refresh ID3 tags, smart chapters, chapter JSON, and touch every generated MP3
- npm run build:podcast-site to update the player page and RSS feed
- Upload new audio to the GitHub Release
- Commit reviewed scripts, metadata, generated site/feed files, and source changes
MP3 Metadata, Embedded Scripts, and Chapters
The metadata tool writes the following ID3 fields to each MP3:
- Title: episode or challenge title
- Artist, album artist, and publisher: Community Access
- Album: Git Going with GitHub - Audio Series
- Author website: Community Access website
- Description: episode description or challenge focus
- Episode script: the matching podcasts/scripts/*.txt source embedded as both a custom text frame and an unsynchronized lyrics frame
- Smart chapters: ID3 chapter frames derived from podcasts/audio/segments/<episode>/manifest.json, preferring transcript-authored chapter plans from podcasts/transcripts/*-chapters.json when available
- Chapter sidecars: Podcasting 2.0 JSON files in podcasts/chapters/, linked from RSS as podcast:chapters
Chapter markers now have a two-stage flow:
- Transcript generation writes ordered chapter plans using segment indexes, while the lesson structure is still available.
- Metadata tagging converts those segment indexes into timed chapter markers after audio generation.
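The second stage amounts to a running sum over segment durations. The sketch below shows the idea under stated assumptions: the field names `title`, `segment_index`, and `start_seconds` and the function name are hypothetical, and the example ignores any silence inserted between segments, which a real implementation would need to include.

```python
def chapter_start_times(segment_durations, chapter_plan):
    """Map each chapter's first segment index to a start time in seconds."""
    # starts[i] is the moment segment i begins, from a cumulative sum.
    starts = []
    elapsed = 0.0
    for duration in segment_durations:
        starts.append(elapsed)
        elapsed += duration

    markers = []
    for chapter in sorted(chapter_plan, key=lambda item: item["segment_index"]):
        index = chapter["segment_index"]
        if not 0 <= index < len(starts):
            raise ValueError(f"segment index {index} is out of range")
        markers.append({"title": chapter["title"], "start_seconds": starts[index]})
    return markers


# Hypothetical durations (seconds) and a two-chapter plan.
durations = [2.0, 3.5, 1.5, 4.0]
plan = [
    {"title": "Welcome", "segment_index": 0},
    {"title": "Opening an issue", "segment_index": 2},
]
print(chapter_start_times(durations, plan))
```

Because the plan stores indexes rather than timestamps, the same plan stays valid when pacing options change and the audio is regenerated; only the durations differ.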
If no transcript-authored chapter plan exists, the metadata tool falls back to the older pause-aware heuristic. That fallback starts with the opening segment, prefers natural boundaries after [PAUSE], avoids very short chapters, and forces a new marker when a section grows too long.
For one-episode agentic pilot work (automatic model selection), see podcasts/tools/agentic-pilot/README.md.
For a full-catalog refresh, the recommended sequence is:
npm run generate:podcast-transcripts
npm run podcast:chapters:normalize
npm run podcast:chapters:audit
Run the dry-run check first:
npm run podcast:metadata:check
Only after all 75 MP3 files and segment manifests exist, write the tags and chapters and refresh file modification times:
npm run podcast:metadata:write
If you are testing a partial batch intentionally, call the tool directly with --allow-missing and an explicit audio directory. Do not use that option for the final publishing pass.
Troubleshooting
Kokoro dependencies missing
Ensure Kokoro and audio dependencies are installed:
pip install kokoro-onnx soundfile numpy
Model not found
Download models first:
python -m podcasts.tts.download_kokoro_samples --english-high-quality-only
The legacy Piper model downloader is still available for fallback runs:
python -m podcasts.tts.download_samples
Mispronounced word
Add an entry to podcasts/tts/lexicon.txt with the correct pronunciation and regenerate the episode.
Script quality issues
If generated scripts miss concepts or have formatting issues:
- Regenerate podcasts/bundles/*.md from the current source material
- Edit the script manually in podcasts/scripts/ before generating audio
- Check that the final script uses only [ALEX], [JAMIE], and [PAUSE] markers
- Keep the Alex/Jamie banter style while making the teaching accurate
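The marker check in the list above can be automated with a short script. This is a sketch rather than a tool that ships with the pipeline: the function name is hypothetical, and it assumes a marker is any uppercase token wrapped in square brackets.

```python
import re

ALLOWED_MARKERS = {"ALEX", "JAMIE", "PAUSE"}
# Assumption: markers are uppercase bracketed tokens such as [ALEX].
MARKER_PATTERN = re.compile(r"\[([A-Z][A-Z0-9_ -]*)\]")


def find_unknown_markers(script_text: str) -> list[tuple[int, str]]:
    """Return (line_number, marker) pairs for markers outside the allowed set."""
    problems = []
    for line_number, line in enumerate(script_text.splitlines(), start=1):
        for match in MARKER_PATTERN.finditer(line):
            if match.group(1) not in ALLOWED_MARKERS:
                problems.append((line_number, match.group(0)))
    return problems


SCRIPT = "[ALEX] Welcome back.\n[PAUSE]\n[JAMIE] Thanks!\n[MUSIC] intro sting\n"
print(find_unknown_markers(SCRIPT))
```

An empty result means every bracketed marker in the script is one of the three the generator expects.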
Audio sounds robotic or unnatural
- Try different Kokoro voices with --male-voice and --female-voice
- Adjust the generator options in podcasts/tts/generate_all_kokoro.py for spacing and chunking
- Add pronunciation fixes to podcasts/tts/lexicon.txt
Cost Summary
| Step | Tool | Cost |
|---|---|---|
| Bundle generation | Local Node.js script | Free |
| Script drafting | Manual or AI-assisted, reviewed before commit | Depends on chosen tool |
| Audio synthesis | Local Kokoro TTS | Free |
| Site and RSS generation | Local Node.js script | Free |