Podcast Audio Pipeline

This directory contains the complete pipeline for producing the Git Going with GitHub audio series: 54 companion episodes of two-host conversational content designed for blind and low-vision developers, plus 21 Challenge Coach episodes placed near the chapters they support.

Pipeline Overview

build-bundles.js     Generate source bundles from chapter content
        |
        v
  bundles/*.md       One bundle per episode (production prompt + source material)
        |
        v
  scripts/*.txt      Conversational scripts with [ALEX]/[JAMIE]/[PAUSE] markers
        |
        v
  transcripts/*-chapters.json  Ordered chapter plans by segment index
        |
        v
  tts/               Local neural TTS (ONNX models, pronunciation lexicon)
        |
        v
  audio/*.(wav|mp3)  Final episode files (gitignored, hosted on GitHub Releases)
        |
        v
generate-site.js     Build PODCASTS.md player page and RSS feed from manifest

Audio is generated locally using ONNX text-to-speech models. No cloud APIs, no API keys, no billing. Runs entirely on your machine.

Use the unified audio command as the front door. It defaults to Kokoro, the production audio path. Piper remains available only when explicitly selected for fallback or comparison. See Piper Fallback TTS for setup, scope, and validation guidance.

python -m podcasts.tts.generate_audio --audio-format mp3
python -m podcasts.tts.generate_audio --engine kokoro --audio-format mp3
python -m podcasts.tts.generate_audio --engine piper --audio-format mp3

podcasts/config/listening-order.json controls the public listening path. It interleaves companion lessons, Challenge Coach episodes, and reference episodes so podcast apps and the generated player page present the workshop as one end-to-end experience. The audio generators and inventory tools use the same order, so generation queues match the learner path.

Final episode output format is configurable through podcasts/tts/voice-config.ini (episode_audio_format = wav|mp3|both). MP3 generation requires ffmpeg on your PATH.
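
The format setting above could be read and checked along these lines. This is a minimal sketch, not the pipeline's own loader: the [audio] section name and the helper names are assumptions made for illustration, and only the episode_audio_format key and its three values come from this guide.

```python
import configparser
import shutil

ALLOWED_FORMATS = {"wav", "mp3", "both"}


def resolve_audio_format(ini_text: str, section: str = "audio") -> str:
    """Return episode_audio_format from INI text, validated against the allowed values.

    The section name is hypothetical; adjust it to match the real voice-config.ini.
    """
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    value = parser.get(section, "episode_audio_format").strip().lower()
    if value not in ALLOWED_FORMATS:
        raise ValueError(
            f"episode_audio_format must be one of {sorted(ALLOWED_FORMATS)}, got {value!r}"
        )
    return value


def needs_ffmpeg(audio_format: str) -> bool:
    """MP3 output (alone or alongside WAV) requires ffmpeg."""
    return audio_format in {"mp3", "both"}


def ffmpeg_available() -> bool:
    """True when an ffmpeg executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None
```

Checking needs_ffmpeg() together with ffmpeg_available() before a long batch run would surface a missing ffmpeg early rather than after synthesis has finished.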

After all MP3 files and segment manifests are generated, run the metadata pass to add ID3 tags, embed the source script, and derive smart chapter markers for each episode. The metadata pass defaults to a dry run so it can verify the full 75-file set before touching audio.

Directory Structure

The following tree separates source files, generated artifacts, local caches, and legacy helpers.

podcasts/
  README.md                         This guide
  REGENERATION.md                   Full regeneration runbook
  config/listening-order.json       Canonical public listening path
  lib/listening-plan.js             Shared JavaScript listening-order resolver
  listening_plan.py                 Shared Python listening-order resolver
  build-bundles.js                  Companion episode catalog and bundle generator
  build-challenge-bundles.js        Challenge Coach catalog and bundle generator
  generate-draft-transcripts.js     Reviewable Alex/Jamie script generator
  generate-site.js                  Generates https://lp.csedesigns.com/ggg/PODCASTS.html and feed.xml
  validate-catalog.js               Validates source coverage
  validate-listening-order.js       Validates the complete public listening path
  validate-feed.js                  Validates RSS structure and enclosures
  verify_audio_inventory.py         Validates scripts, transcripts, manifests, and MP3s
  tag-audio-metadata.py             Writes ID3 metadata and chapter markers
  manifest.json                     Companion episode metadata
  feed.xml                          Generated RSS feed
  bundles/                          Generated companion prompt packets
  challenge-bundles/                Generated Challenge Coach prompt packets
  scripts/                          Committed transcript source scripts
  transcripts/                      Derived segment JSON transcripts and chapter plans
  chapters/                         Podcasting 2.0 chapter JSON sidecars written during metadata tagging
  audio/                            Local audio output and segment cache, not committed
  logs/                             Local generation and inventory reports
  tools/agentic-pilot/              One-episode packet builder and transcript evaluation helpers
  tools/legacy/                     Older diagnostic and one-off helpers
  tts/                              Production and fallback TTS package

Prerequisites

Python 3.10 or later
Kokoro TTS: pip install kokoro-onnx soundfile numpy
FFmpeg on your PATH for MP3 conversion
Mutagen for MP3 ID3 metadata: pip install mutagen
Node.js 18 or later (for bundle/site generation only)

For full regeneration guidance after curriculum changes, see the Podcast Regeneration Runbook (REGENERATION.md).

Quick Start

1. Install Kokoro and metadata tooling

pip install kokoro-onnx soundfile numpy
pip install mutagen

2. Download voice models (if not already present)

python -m podcasts.tts.download_kokoro_samples --english-high-quality-only

This downloads the Kokoro ONNX model and voices file to podcasts/tts/models/.

3. Validate the catalog and generate fresh local bundles

npm run validate:podcasts
npm run build:podcast-bundles
npm run build:podcast-challenge-bundles
npm run generate:podcast-transcripts

podcasts/bundles/*.md files are generated prompt packets. They are intentionally ignored by git and should be regenerated when needed. podcasts/challenge-bundles/*.md files are the same kind of generated prompt packet, but scoped to individual Challenge Coach episodes.

Transcript generation now writes three artifacts for each episode or challenge:

the transcript source in podcasts/scripts/
the segment manifest in podcasts/transcripts/*-segments.json
the ordered chapter plan in podcasts/transcripts/*-chapters.json

The chapter plan is sequential, not time-based: it stores chapter titles with segment indexes. After audio generation, the metadata pass converts those ordered boundaries into timed ID3 chapter markers and Podcasting 2.0 chapter sidecars.
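
The conversion from segment indexes to timed markers can be illustrated with a short sketch. The field names (title, segment_index, start_seconds) and the function name are assumptions for this example, not the actual JSON schema, and the sketch assumes each segment duration already includes any trailing silence.

```python
def chapter_start_times(plan, segment_durations):
    """Map chapter plan entries to start times in seconds.

    plan: list of {"title": str, "segment_index": int} (hypothetical field names)
    segment_durations: seconds per audio segment, in playback order
    """
    # offsets[i] is the time at which segment i begins.
    offsets = [0.0]
    for duration in segment_durations:
        offsets.append(offsets[-1] + duration)

    timed = []
    for chapter in sorted(plan, key=lambda c: c["segment_index"]):
        index = chapter["segment_index"]
        if not 0 <= index < len(segment_durations):
            raise IndexError(f"segment_index {index} is outside the generated audio")
        timed.append({"title": chapter["title"], "start_seconds": round(offsets[index], 3)})
    return timed
```

For example, a plan that starts chapters at segments 0 and 2, with segment durations of 4.0, 6.5, and 3.0 seconds, yields start times of 0.0 and 10.5 seconds.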

You can also regenerate a subset instead of rebuilding all 75 scripts:

npm run generate:podcast-transcript -- --slug ep05-working-with-issues
npm run generate:podcast-transcript -- --start 1 --end 4 --group challenges
npm run generate:podcast-transcript -- --start 20 --end 25 --group appendices

4. Generate all episodes

Preview the listening-order generation queue without loading models or creating audio:

npm run podcast:audio:queue

Then generate audio when you are ready:

python -m podcasts.tts.generate_audio --audio-format mp3

This batch command processes the full committed script set: all ep*.txt companion episodes plus all cc-*.txt Challenge Coach and bonus episodes.

Or generate a single episode:

python -m podcasts.tts.generate_audio --start 0 --end 0 --force --audio-format mp3
python -m podcasts.tts.generate_audio --start 5 --end 5 --force --audio-format mp3

Or a range:

python -m podcasts.tts.generate_audio --start 0 --end 10 --audio-format mp3

5. Build player page and RSS feed

npm run build:podcast-site

Voice Configuration

The default voices are:

Host	Kokoro Voice	Character	Description
Alex	am_liam	Lead host, experienced, warm	Male, polished delivery with stronger presence
Jamie	af_jessica	Co-host, curious, energetic	Female, clear and natural delivery

Listen to the samples in podcasts/tts/samples/ to compare voices.

To change Kokoro voices, pass --male-voice and --female-voice through the unified command: python -m podcasts.tts.generate_audio --engine kokoro --male-voice am_liam --female-voice af_jessica.

If delivery feels too slow or too fast, tune Kokoro pacing directly:

python -m podcasts.tts.generate_audio --engine kokoro --speech-speed 1.08 --pause-seconds 1.0 --inter-segment-seconds 0.18 --inter-speaker-seconds 0.28

Pitch can be configured independently per host in podcasts/tts/voice-config.ini:

male_pitch_semitones = -1.0
female_pitch_semitones = 0.8

For backwards compatibility, pitch_semitones is still accepted and applies the same shift to both voices.
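
One way the per-host and shared pitch keys could be resolved is sketched below. The precedence shown (per-host keys override the shared pitch_semitones value) is an assumption, as are the function names; the semitone-to-frequency ratio is the standard equal-temperament relation 2^(n/12).

```python
def resolve_pitch(settings: dict) -> dict:
    """Return the pitch shift per host, in semitones.

    Assumption: per-host keys take precedence, and the legacy pitch_semitones
    key supplies the value for any host that has no key of its own.
    """
    shared = float(settings.get("pitch_semitones", 0.0))
    return {
        "male": float(settings.get("male_pitch_semitones", shared)),
        "female": float(settings.get("female_pitch_semitones", shared)),
    }


def semitones_to_ratio(semitones: float) -> float:
    """Convert a semitone shift to a frequency ratio (12 semitones doubles the pitch)."""
    return 2.0 ** (semitones / 12.0)
```

With the values shown above, Alex is shifted down by one semitone (a ratio of roughly 0.944) and Jamie up by 0.8 semitones (roughly 1.047).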

Pronunciation Lexicon

The file podcasts/tts/lexicon.txt contains pronunciation overrides for technical terms, acronyms, and jargon. The lexicon is applied as text substitution before Kokoro synthesizes each segment.

Format: one entry per line, tab-separated WORD<tab>REPLACEMENT. Lines starting with # are comments.

Example entries:

WCAG	W-Cag
NVDA	N V D A
GitHub	Git Hub
JSON	Jason

Add new entries when Kokoro mispronounces a word. The lexicon is loaded once per run and uses word-boundary matching so entries like GUI do not affect words like "guidelines".
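
The lexicon behavior described above can be sketched as follows. This is an illustration, not the production loader: the function names are hypothetical, case-insensitive matching is an assumption, and the GUI replacement in the usage note is an invented entry.

```python
import re


def parse_lexicon(text: str) -> dict:
    """Parse tab-separated WORD<tab>REPLACEMENT lines, skipping blanks and # comments."""
    entries = {}
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        word, sep, replacement = line.partition("\t")
        if sep and word.strip():
            entries[word.strip()] = replacement.strip()
    return entries


def apply_lexicon(segment: str, entries: dict) -> str:
    """Replace whole-word matches only, in a single pass over the segment."""
    if not entries:
        return segment
    lookup = {word.lower(): replacement for word, replacement in entries.items()}
    # Longest terms first so a longer entry wins over a shorter one it contains.
    alternatives = "|".join(re.escape(w) for w in sorted(lookup, key=len, reverse=True))
    pattern = re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)
    return pattern.sub(lambda m: lookup[m.group(0).lower()], segment)
```

With a hypothetical entry mapping GUI to "gooey", the sentence "Open the GUI and read the guidelines." becomes "Open the gooey and read the guidelines."; the word boundary keeps "guidelines" intact even though matching ignores case.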

Manifest Status Flow

Each episode in manifest.json progresses through these statuses:

bundle-ready  -->  script-ready  -->  audio-ready  -->  published
(build-bundles)    (scripts/)         (tts/)            (GitHub Release)
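
The status flow above is a simple ordered sequence. A minimal sketch of how a tool might model it follows; the helper names are hypothetical, and whether the real tooling enforces single-step transitions is an assumption.

```python
STATUS_ORDER = ["bundle-ready", "script-ready", "audio-ready", "published"]


def next_status(current: str):
    """Return the status that follows `current`, or None when already published."""
    position = STATUS_ORDER.index(current)  # raises ValueError for unknown statuses
    if position + 1 < len(STATUS_ORDER):
        return STATUS_ORDER[position + 1]
    return None


def can_advance(current: str, target: str) -> bool:
    """Allow only a move to the immediately following status (an assumed rule)."""
    return STATUS_ORDER.index(target) == STATUS_ORDER.index(current) + 1
```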

All npm Scripts

The following table lists the supported podcast build and validation commands.

Command	What It Does
`npm run validate:podcasts`	Validate episode catalog source mappings and the complete listening order
`npm run build:podcast-bundles`	Generate source bundles from chapters
`npm run build:podcast-challenge-bundles`	Generate source bundles for Challenge Coach episodes
`npm run generate:podcast-transcripts`	Replace old scripts with fresh reviewable Alex/Jamie draft transcripts
`npm run generate:podcast-transcript -- --slug <slug>`	Regenerate one selected transcript or a filtered range using `--start`, `--end`, and `--group`
`npm run podcast:agentic:packet -- --slug <slug>`	Build a single episode source packet for rewrite and review workflows
`npm run podcast:agentic:stage -- --slug <slug> --input <file>`	Save a retryable candidate rewrite under `logs/agentic-pilots/candidates/<slug>/attempt-###.txt`
`npm run podcast:agentic:catalog`	Run full-catalog coverage/style/repetition evaluation with per-episode reports
`npm run podcast:agentic:promote -- --slug <slug>`	Promote an accepted candidate transcript into the live script path only when all gates pass
`npm run podcast:chapters:normalize`	Normalize generated chapter-plan sidecars to remove weak or overly generic titles
`npm run podcast:chapters:audit`	Audit all generated chapter-plan sidecars and report title quality across the full catalog
`npm run build:podcast-transcripts`	Run validation, regenerate bundles, regenerate transcripts, and rebuild podcast page/feed
`npm run build:podcast-audio`	Generate MP3 audio for all companion, Challenge Coach, and bonus scripts with local Kokoro TTS
`npm run build:podcast-audio:piper`	Generate audio with the legacy local Piper TTS path
`npm run build:podcast-audio:kokoro`	Generate MP3 audio with the Kokoro TTS path
`npm run podcast:audio:queue`	Print the listening-order audio generation queue without creating MP3 files
`npm run build:podcast-transcripts-and-audio`	Run full transcript pipeline, generate audio, and rebuild podcast page/feed
`npm run build:podcast-site`	Build player page and RSS feed
`npm run podcast:metadata:check`	Dry-run validation that all 75 MP3s and matching scripts are present before tagging
`npm run podcast:metadata:write`	Write ID3 metadata, embed episode scripts and smart chapters, write chapter JSON sidecars, and refresh the modification time of all 75 MP3 files
`npm run build:podcasts`	Bundles + site
`npm run build`	Full build: podcasts + HTML site

Publishing Audio

Audio files are hosted on GitHub Releases. They are gitignored and are not committed to the repository.

Generate all audio as MP3 files: npm run build:podcast-audio
Confirm all expected MP3 files exist: npm run podcast:metadata:check
Write ID3 tags, embed the source script, derive smart chapter markers, and refresh each MP3's modification time: npm run podcast:metadata:write
Build the podcast page and RSS feed: npm run build:podcast-site
Create a GitHub Release tagged podcasts
Upload the MP3 files from podcasts/audio/ or the selected voice output folder as release assets
The RSS feed points to release asset URLs, links chapter JSON sidecars, and embeds clean script text in each item
Update manifest status to published and rebuild the site

Updating Episodes

When chapter content changes:

npm run validate:podcasts to catch missing source mappings and coverage gaps
npm run build:podcast-bundles to regenerate local chapter and appendix bundles
npm run build:podcast-challenge-bundles to regenerate local challenge bundles
npm run generate:podcast-transcripts to replace old scripts with fresh reviewable drafts
Review and edit the scripts in podcasts/scripts/
python -m podcasts.tts.generate_audio --start <number> --end <number> --force --audio-format mp3 to regenerate audio
npm run podcast:metadata:check to verify the complete MP3 set before tagging
npm run podcast:metadata:write to refresh ID3 tags, smart chapters, chapter JSON, and the modification time of every generated MP3
npm run build:podcast-site to update the player page and RSS feed
Upload new audio to the GitHub Release
Commit reviewed scripts, metadata, generated site/feed files, and source changes

MP3 Metadata, Embedded Scripts, and Chapters

The metadata tool writes the following ID3 fields to each MP3:

Title: episode or challenge title
Artist, album artist, and publisher: Community Access
Album: Git Going with GitHub - Audio Series
Author website: Community Access website
Description: episode description or challenge focus
Episode script: the matching podcasts/scripts/*.txt source embedded as both a custom text frame and an unsynchronized lyrics frame
Smart chapters: ID3 chapter frames derived from podcasts/audio/segments/<episode>/manifest.json, preferring transcript-authored chapter plans from podcasts/transcripts/*-chapters.json when available
Chapter sidecars: Podcasting 2.0 JSON files in podcasts/chapters/, linked from RSS as podcast:chapters
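
For orientation, a Podcasting 2.0 chapter sidecar is a small JSON document with a version string and a list of chapters, each carrying a startTime in seconds and a title. The sketch below builds one from timed chapters; the function name and the input tuple shape are assumptions, and the real tool may emit additional optional fields.

```python
import json


def build_chapters_sidecar(chapters):
    """Build a Podcasting 2.0 chapters document.

    chapters: iterable of (start_seconds, title) tuples (hypothetical input shape)
    """
    ordered = sorted(chapters, key=lambda chapter: chapter[0])
    return {
        "version": "1.2.0",
        "chapters": [
            {"startTime": round(float(start), 3), "title": title}
            for start, title in ordered
        ],
    }


def sidecar_json(chapters) -> str:
    """Serialize the sidecar with stable, human-readable formatting."""
    return json.dumps(build_chapters_sidecar(chapters), indent=2, ensure_ascii=False)
```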

Chapter markers now have a two-stage flow:

Transcript generation writes ordered chapter plans using segment indexes, while the lesson structure is still available.
Metadata tagging converts those segment indexes into timed chapter markers after audio generation.

If no transcript-authored chapter plan exists, the metadata tool falls back to the older pause-aware heuristic. That fallback starts with the opening segment, prefers natural boundaries after [PAUSE], avoids very short chapters, and forces a new marker when a section grows too long.
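
The fallback rules above can be approximated with a sketch like the following. It is not the tool's implementation: the segment fields (duration, after_pause), the function name, and the 60-second and 300-second thresholds are all assumptions chosen for illustration.

```python
def fallback_chapter_starts(segments, min_seconds=60.0, max_seconds=300.0):
    """Return the indexes of segments that begin a chapter.

    segments: list of {"duration": float, "after_pause": bool}, where after_pause
    marks a segment that directly follows a [PAUSE] marker (hypothetical fields).
    """
    if not segments:
        return []
    starts = [0]   # the opening segment always starts the first chapter
    elapsed = 0.0  # seconds accumulated in the current chapter
    for index, segment in enumerate(segments):
        if index > 0:
            # Natural boundary: right after a pause, once the chapter is long enough.
            natural = segment["after_pause"] and elapsed >= min_seconds
            # Forced boundary: the current chapter has grown too long.
            forced = elapsed >= max_seconds
            if natural or forced:
                starts.append(index)
                elapsed = 0.0
        elapsed += segment["duration"]
    return starts
```

The returned segment indexes would still need to be converted to timestamps using the segment durations, in the same way as transcript-authored chapter plans.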

For one-episode agentic pilot work (automatic model selection), see podcasts/tools/agentic-pilot/README.md.

For a full-catalog refresh, the recommended sequence is:

npm run generate:podcast-transcripts
npm run podcast:chapters:normalize
npm run podcast:chapters:audit

Run the dry-run check first:

npm run podcast:metadata:check

Only after all 75 MP3 files and segment manifests exist should you write tags and chapters and refresh file modification times:

npm run podcast:metadata:write

If you are testing a partial batch intentionally, call the tool directly with --allow-missing and an explicit audio directory. Do not use that option for the final publishing pass.
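
The difference between the strict check and a partial-batch run can be pictured with a small sketch. The <slug>.mp3 naming, the function names, and the return shape are assumptions; only the idea of an allow-missing switch comes from this guide.

```python
def missing_episodes(expected_slugs, present_filenames):
    """Return the expected slugs that have no matching <slug>.mp3 file name."""
    present = set(present_filenames)
    return [slug for slug in expected_slugs if f"{slug}.mp3" not in present]


def inventory_ok(expected_slugs, present_filenames, allow_missing=False):
    """Return (ok, missing).

    A strict run fails when anything is missing; allow_missing tolerates gaps
    but still reports them so a partial batch is never mistaken for a full one.
    """
    missing = missing_episodes(expected_slugs, present_filenames)
    return (allow_missing or not missing), missing
```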

Troubleshooting

Kokoro dependencies missing

Ensure Kokoro and audio dependencies are installed:

pip install kokoro-onnx soundfile numpy

Model not found

Download models first:

python -m podcasts.tts.download_kokoro_samples --english-high-quality-only

The legacy Piper model downloader is still available for fallback runs:

python -m podcasts.tts.download_samples

Mispronounced word

Add an entry to podcasts/tts/lexicon.txt with the correct pronunciation and regenerate the episode.

Script quality issues

If generated scripts miss concepts or have formatting issues:

Regenerate podcasts/bundles/*.md from the current source material
Edit the script manually in podcasts/scripts/ before generating audio
Check that the final script uses only [ALEX], [JAMIE], and [PAUSE] markers
Keep the Alex/Jamie banter style while making the teaching accurate
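
The marker check in the list above can be automated with a short script. This is a sketch under the assumption that markers are bracketed tokens made of letters, digits, spaces, underscores, or hyphens; the function name is hypothetical, and matching is case-sensitive, so [alex] is reported as unexpected.

```python
import re

ALLOWED_MARKERS = {"ALEX", "JAMIE", "PAUSE"}
# Any bracketed token that starts with a letter is treated as a marker candidate.
MARKER_PATTERN = re.compile(r"\[([A-Za-z][A-Za-z0-9 _-]*)\]")


def unexpected_markers(script_text: str):
    """Return (line_number, marker) pairs for every marker outside the allowed set."""
    problems = []
    for line_number, line in enumerate(script_text.splitlines(), start=1):
        for match in MARKER_PATTERN.finditer(line):
            if match.group(1) not in ALLOWED_MARKERS:
                problems.append((line_number, match.group(0)))
    return problems
```

Running this over a script before audio generation gives a quick list of lines to fix by hand.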

Audio sounds robotic or unnatural

Try different Kokoro voices with --male-voice and --female-voice
Adjust spacing with --speech-speed, --pause-seconds, --inter-segment-seconds, and --inter-speaker-seconds, or tune the chunking options in podcasts/tts/generate_all_kokoro.py
Add pronunciation fixes to podcasts/tts/lexicon.txt

Cost Summary

Step	Tool	Cost
Bundle generation	Local Node.js script	Free
Script drafting	Manual or AI-assisted, reviewed before commit	Depends on chosen tool
Audio synthesis	Local Kokoro TTS	Free
Site and RSS generation	Local Node.js script	Free