antirez

Pure C inference of Mistral Voxtral Realtime 4B speech to text model

1,456 stars · 100% credibility
Found Feb 06, 2026 at 285 stars
AI Summary

A standalone C program that runs Mistral's Voxtral speech-to-text model locally to transcribe audio files, live microphone input, or streaming audio with minimal dependencies.

How It Works

1
🔍 Discover Voxtral C

You hear about a super-fast way to turn speech into text that runs right on your computer, no fancy setup needed.

2
📥 Grab the files

Download the program's source files to your computer.

3
⚙️ Ready in seconds

Run one easy command to build it for your Mac or Linux machine; it picks the best speed option automatically.

4
📦 Add the voice smarts

Download the model weights (the "voice understanding" files) with another quick command.

5
Pick your input

🎵 Use a recording

Point it at any audio file, like a podcast or a voice note.

🗣️ Speak live

Start talking into your microphone and see text appear instantly.

6
Watch text appear

Play your sound or speak, and the transcript streams out word by word in real time.

Perfect speech text

Enjoy accurate transcripts of any speech, ready to copy or save, fast and private on your own machine.
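The steps above boil down to a handful of terminal commands. The repository URL, script names, and flags below are assumptions for illustration, not confirmed by this page; check the project's README for the real ones.

```shell
# Hypothetical command sketch of the workflow above -- names assumed.
git clone https://github.com/antirez/voxtral.c   # step 2: grab the files
cd voxtral.c
make                      # step 3: build; picks Metal or BLAS automatically
./download-model.sh       # step 4: fetch the Voxtral weights (script name assumed)
./voxtral recording.wav   # step 5a: transcribe an audio file
./voxtral --mic           # step 5b: live microphone transcription (macOS)
```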


AI-Generated Review

What is voxtral.c?

voxtral.c provides pure C inference for Mistral's Voxtral Realtime 4B speech-to-text model, turning WAV files, live microphone input (macOS), or piped streams into streaming text transcripts with no Python, CUDA, or vLLM dependencies. Pipe any audio to stdin via ffmpeg for real-time transcription, or use the C streaming API to feed audio chunks and pull tokens as they are generated. A rolling KV cache lets it handle audio of unlimited length, and Apple Metal acceleration is supported.

Why is it gaining traction?

Zero external dependencies beyond the standard library, in the spirit of other single-file C projects such as llama.cpp, make it dead simple to build and deploy on desktops or servers, with no runtime headaches. The Metal backend reaches real-time speeds on Apple Silicon (about 2.5x faster than the audio itself), with a BLAS fallback for Linux/Intel, and CLI flags tune latency via the processing interval. Developers also appreciate the self-contained Python reference implementation and the alternative-token output for spotting ambiguous transcriptions.

Who should use this?

Apple developers building local voice UIs or chat apps, embedded engineers who need transcription without heavyweight frameworks, and backend teams piping podcasts through ffmpeg for server-side inference. It is also a good fit for CPU-only prototypes before scaling to GPU.

Verdict

Grab it for lightweight, local STT: it builds in seconds and runs fast on M-series machines. The project is still in its early days (the docs note that long-audio testing is needed), but the antirez pedigree and MIT license make it a low-risk playground.


