Frikallo

Ultra-fast, portable Parakeet implementation for on-device inference in C++ using Axiom with MPS + Unified Memory

218 stars · 7 forks · 100% credibility
Found Feb 27, 2026 at 190 stars.
Language: C++
AI Summary

A lightweight C++ tool for running NVIDIA's Parakeet speech recognition models to transcribe audio files quickly on Apple Silicon.

How It Works

1
🎤 Discover fast speech-to-text

You hear about a speedy way to turn audio recordings into written words right on your Mac.

2
📥 Get the ready-to-use files

Download the simple tool and matching voice model files from the trusted source.

3
🔧 Set it up quickly

Follow the easy steps to prepare everything so it's ready to listen to your audio.

4
🎵 Feed in your audio

Point it at your recording, like a meeting or podcast, and it quickly converts the speech to text.

5
📝 View the results

Read the accurate transcription with word timings to see exactly when each part was said.

Talk turned into text magic

Now transcribe any audio effortlessly, saving hours of manual typing.


AI-Generated Review

What is parakeet.cpp?

parakeet.cpp ports NVIDIA's Parakeet speech models to pure C++ for ultra-fast on-device ASR inference: think whisper.cpp, but for Parakeet and optimized for Apple Silicon GPUs via Axiom. Feed it 16kHz WAV files via the CLI or a simple API—`Transcriber t("model.safetensors", "vocab.txt"); auto result = t.transcribe("audio.wav");`—and get text, timestamps, or speaker diarization. It supports offline multilingual transcription, low-latency streaming, and CUDA/MPS backends, with no Python or ONNX needed.
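Since the library expects 16kHz WAV input, a caller may want to sanity-check a file's format before handing it to `Transcriber`. Below is a minimal, self-contained sketch of such a check; it is illustrative helper code (assuming a little-endian host and standard RIFF layout), not part of the parakeet.cpp API.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Fields we care about from a WAV file's "fmt " chunk.
struct WavFormat {
    uint16_t channels = 0;
    uint32_t sampleRate = 0;
    uint16_t bitsPerSample = 0;
};

// Scan the RIFF chunk list for "fmt " and extract the audio format.
// Returns false if the buffer is not a well-formed WAVE header.
bool parseWavFormat(const std::vector<uint8_t>& bytes, WavFormat& out) {
    if (bytes.size() < 12 || std::memcmp(bytes.data(), "RIFF", 4) != 0 ||
        std::memcmp(bytes.data() + 8, "WAVE", 4) != 0)
        return false;
    size_t pos = 12;                                   // first sub-chunk
    while (pos + 8 <= bytes.size()) {
        uint32_t sz;
        std::memcpy(&sz, bytes.data() + pos + 4, 4);   // little-endian size
        if (std::memcmp(bytes.data() + pos, "fmt ", 4) == 0 && sz >= 16) {
            std::memcpy(&out.channels,      bytes.data() + pos + 10, 2);
            std::memcpy(&out.sampleRate,    bytes.data() + pos + 12, 4);
            std::memcpy(&out.bitsPerSample, bytes.data() + pos + 22, 2);
            return true;
        }
        pos += 8 + sz + (sz & 1);                      // chunks are word-aligned
    }
    return false;
}

// A caller would then require sampleRate == 16000 (and typically mono,
// 16-bit PCM) before invoking the transcriber.
```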

Why is it gaining traction?

It crushes benchmarks: 27ms encoder time on an M3 GPU for 10s of audio (a 96x CPU speedup), outrunning PyTorch MPS while staying portable. Devs love the dead-simple model conversion from NeMo checkpoints, greedy decoding with word timestamps, and chunked streaming for real-time apps; pure C++ means it embeds anywhere without runtime bloat.

Who should use this?

iOS/macOS devs building local voice assistants or meeting note-takers needing ultra-fast STT without cloud latency. Real-time transcription tools, like podcast editors or call analyzers, will dig the diarization and configurable streaming latency. Anyone ditching heavy frameworks for on-device inference on Apple hardware.

Verdict

Grab it for blazing-fast local ASR if you're on Apple Silicon; the CLI and API make prototyping instant. At 218 stars the project is still early (solid docs and benchmarks, though the test suite could grow), but the speed claims are backed by numbers, and it's ready today for speed-critical prototypes.


