Frikallo

Ultra-fast, portable Parakeet implementation for on-device inference in C++ using Axiom with MPS + Unified Memory

218 stars · 7 forks · 100% credibility
Found Feb 27, 2026 at 190 stars.
Language: C++
AI Summary

A lightweight C++ tool for running NVIDIA's Parakeet speech recognition models to transcribe audio files quickly on Apple Silicon.

How It Works

1
🎤 Discover fast speech-to-text

You hear about a speedy way to turn audio recordings into written words right on your Mac.

2
📥 Get the ready-to-use files

Download the simple tool and matching voice model files from the trusted source.

3
🔧 Set it up quickly

Follow the easy steps to prepare everything so it's ready to listen to your audio.

4
🎵 Feed in your audio

Point it at your recording, like a meeting or podcast, and it quickly converts the speech to text.

5
📝 View the results

Read the accurate transcription with word timings to see exactly when each part was said.

Talk turned into text magic

Now transcribe any audio effortlessly, saving hours of manual typing.


AI-Generated Review

What is parakeet.cpp?

parakeet.cpp ports NVIDIA's Parakeet speech models to pure C++ for ultra-fast on-device ASR inference: think whisper.cpp, but for Parakeet and optimized for Apple Silicon GPUs via Axiom. Feed it 16kHz WAV files via the CLI or a simple API—`Transcriber t("model.safetensors", "vocab.txt"); auto result = t.transcribe("audio.wav");`—and get text, timestamps, or speaker diarization. It supports offline multilingual transcription, low-latency streaming, and CUDA/MPS backends, with no Python or ONNX needed.
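Since the library expects 16kHz WAV input, a caller may want to sanity-check a file's format before handing it to `Transcriber`. Below is a minimal, self-contained sketch of such a check; it is illustrative helper code (assuming a little-endian host and standard RIFF layout), not part of the parakeet.cpp API.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Fields we care about from a WAV file's "fmt " chunk.
struct WavFormat {
    uint16_t channels = 0;
    uint32_t sampleRate = 0;
    uint16_t bitsPerSample = 0;
};

// Scan the RIFF chunk list for "fmt " and extract the audio format.
// Returns false if the buffer is not a well-formed WAVE header.
bool parseWavFormat(const std::vector<uint8_t>& bytes, WavFormat& out) {
    if (bytes.size() < 12 || std::memcmp(bytes.data(), "RIFF", 4) != 0 ||
        std::memcmp(bytes.data() + 8, "WAVE", 4) != 0)
        return false;
    size_t pos = 12;                                   // first sub-chunk
    while (pos + 8 <= bytes.size()) {
        uint32_t sz;
        std::memcpy(&sz, bytes.data() + pos + 4, 4);   // little-endian size
        if (std::memcmp(bytes.data() + pos, "fmt ", 4) == 0 && sz >= 16) {
            std::memcpy(&out.channels,      bytes.data() + pos + 10, 2);
            std::memcpy(&out.sampleRate,    bytes.data() + pos + 12, 4);
            std::memcpy(&out.bitsPerSample, bytes.data() + pos + 22, 2);
            return true;
        }
        pos += 8 + sz + (sz & 1);                      // chunks are word-aligned
    }
    return false;
}

// A caller would then require sampleRate == 16000 (and typically mono,
// 16-bit PCM) before invoking the transcriber.
```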

Why is it gaining traction?

It crushes benchmarks: 27ms encoder time on an M3 GPU for 10s of audio (a 96x CPU speedup), outrunning PyTorch MPS while staying portable. Devs love the dead-simple model conversion from NeMo checkpoints, greedy decoding with word timestamps, and chunked streaming for real-time apps; pure C++ means it embeds anywhere without runtime bloat.

Who should use this?

iOS/macOS devs building local voice assistants or meeting note-takers needing ultra-fast STT without cloud latency. Real-time transcription tools, like podcast editors or call analyzers, will dig the diarization and configurable streaming latency. Anyone ditching heavy frameworks for on-device inference on Apple hardware.

Verdict

Grab it for blazing-fast local ASR if you're on Apple Silicon; the CLI and API make prototyping instant. At 218 stars the project is still early (solid docs and benchmarks, though the test suite could grow), but the speed claims are backed by numbers, and it's ready today for speed-critical prototypes.


