hiveden

Multi-agent TTS production harness: Fish TTS + WhisperX + Claude, with cross-episode memory and auto-fix loop

19
7
100% credibility
Found Apr 01, 2026 at 19 stars.
AI Analysis
JavaScript
AI Summary

An automated workflow that processes a script into speech audio using text-to-speech, validates quality via transcription and AI review, generates time-aligned subtitles, and produces final concatenated audio files.

How It Works

1
📖 Discover the TTS Magic Maker

You find this helpful tool that turns written scripts into spoken audio with perfectly timed subtitles.

2
🛠️ Set Up Your Workspace

Follow simple steps to prepare your computer with the everyday tools it needs, like a notepad app and sound player.

3
✍️ Prepare Your Story Script

Type your script into a simple list of scenes, like an outline of what characters say.

4
🔗 Link Voice and Checker Services

Connect a voice-creating service and a smart reviewer so it can speak your words and spot any slip-ups.

5
Launch and Watch the Magic

Click run, and it splits your text into bites, speaks each part, listens back, fixes errors automatically, adds subtitles, and joins it all into smooth audio.

6
👀 Review the Fun Preview

Open a webpage that plays your audio while subtitles appear right on cue, like a mini video test.

🎉 Enjoy Your Ready-to-Use Audio

You now have crisp spoken files with spot-on subtitles, perfect for videos or podcasts, all done effortlessly.
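
The final joining step implies that subtitle times measured inside each chunk have to be shifted onto the timeline of the concatenated audio. Here is a minimal sketch of that idea. The data shape (a `duration` in seconds plus `cues` with chunk-relative `start` and `end`) and the function name `mergeSubtitles` are assumptions for illustration, not the repo's actual format.

```javascript
// Hypothetical shape: each chunk has a duration (seconds) and cues whose
// start/end times are relative to the beginning of that chunk.
function mergeSubtitles(chunks) {
  const merged = [];
  let offset = 0; // start time of the current chunk within the joined audio

  for (const chunk of chunks) {
    for (const cue of chunk.cues) {
      merged.push({
        text: cue.text,
        start: roundMs(offset + cue.start),
        end: roundMs(offset + cue.end),
      });
    }
    offset += chunk.duration; // the next chunk begins where this one ends
  }

  return { cues: merged, totalDuration: roundMs(offset) };
}

// Round to millisecond precision to keep floating-point noise out of the output.
function roundMs(seconds) {
  return Math.round(seconds * 1000) / 1000;
}

const timeline = mergeSubtitles([
  { duration: 2.5, cues: [{ text: "Hello", start: 0.5, end: 2.0 }] },
  { duration: 1.5, cues: [{ text: "world", start: 0.25, end: 1.25 }] },
]);
console.log(timeline);
```

Using each chunk's measured audio duration as the offset, rather than the end time of its last cue, keeps trailing silence from pulling later subtitles out of sync.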

AI-Generated Review

What is tts-agent-harness?

TTS Agent Harness is a JavaScript multi-agent production harness on GitHub that converts JSON scripts into TTS audio with time-aligned subtitles, tackling error-prone synthesis in mixed Chinese-English content. It chains Fish TTS for voice generation, WhisperX for transcription timestamps, and Claude in an auto-fix loop with cross-episode memory to catch and retry misreads like mangled names or numbers. Developers get ready-to-use per-shot WAVs, durations.json, subtitles.json, and browser previews from a single CLI run.
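
One way to picture how a transcription check could catch a misread name or number is a token comparison between the script text and what the transcriber heard. The sketch below is illustrative only: the tokenizer (lowercased ASCII words and digits, plus one token per CJK character) and the name `findMisreads` are assumptions, and the actual project may well rely on Claude's judgment rather than a plain diff.

```javascript
// Split mixed Chinese-English text into comparable tokens:
// runs of ASCII letters/digits, or single CJK characters.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+|[\u4e00-\u9fff]/g) || [];
}

// Return the expected tokens that the transcript does not account for.
function findMisreads(expectedText, transcriptText) {
  const heard = new Map();
  for (const tok of tokenize(transcriptText)) {
    heard.set(tok, (heard.get(tok) || 0) + 1);
  }

  const missing = [];
  for (const tok of tokenize(expectedText)) {
    const count = heard.get(tok) || 0;
    if (count > 0) {
      heard.set(tok, count - 1); // consume one matching occurrence
    } else {
      missing.push(tok); // expected but never heard: a candidate misread
    }
  }
  return missing;
}

console.log(findMisreads("用 Claude 检查 3 个片段", "用 cloud 检查 3 个片段"));
```

A non-empty result would mark the chunk for another synthesis attempt. A bag-of-tokens comparison ignores word order, so it is a coarse filter at best.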

Why is it gaining traction?

Unlike basic TTS wrappers or experimental LangGraph-style multi-agent projects, this harness delivers a deterministic pipeline with resume-from-step, parallel synthesis, and production-grade retries (up to 3 rounds per chunk), reportedly cutting manual fixes by 75% in tests. The auto-fix loop and the memory of known issues make it a practical alternative to copilot-style agents for teams that need an agent harness rather than a reinforcement-learning (PPO or SAC) multi-agent repo.
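
The combination of parallel synthesis and bounded retries can be sketched as follows. This is an assumption-laden illustration, not the repo's code: `synthesize` and `verify` are hypothetical injected functions standing in for the TTS call and the transcription-plus-review check, and `suggestedText` is an invented field for a reviewer-proposed rewrite.

```javascript
const MAX_ROUNDS = 3; // the review mentions up to 3 rounds per chunk

// Try one chunk up to MAX_ROUNDS times, feeding any suggested rewrite
// back into the next synthesis attempt.
async function processChunk(chunk, { synthesize, verify }) {
  let text = chunk.text;
  for (let round = 1; round <= MAX_ROUNDS; round++) {
    const audio = await synthesize(text);
    const result = await verify(chunk.text, audio);
    if (result.ok) {
      return { id: chunk.id, status: "ok", rounds: round, audio };
    }
    text = result.suggestedText || text; // hypothetical reviewer hint
  }
  // Out of rounds: flag for a human instead of looping forever.
  return { id: chunk.id, status: "needs_review", rounds: MAX_ROUNDS, audio: null };
}

// Chunks are independent, so they can be processed concurrently.
async function processAll(chunks, deps) {
  return Promise.all(chunks.map((chunk) => processChunk(chunk, deps)));
}
```

Capping the rounds keeps run time and API cost predictable, and the explicit `needs_review` status leaves an audit trail for chunks the loop could not fix on its own.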

Who should use this?

Video producers automating podcast or explainer scripts with TTS, especially those handling technical terms, brand names, and bilingual audio. Content creators at startups building subtitle workflows for tools like Remotion. JS developers scripting episode batches who need reliable, auditable output without constant QA.

Verdict

Grab it for prototyping TTS pipelines: solid docs, a CLI invoked like bash run.sh script.json episode, and e2e tests shine despite 19 stars and a 1.0% credibility score. Too early for heavy production without your own scaling, but it is forkable multi-agent code worth watching.
