andimarafioti / nano-parakeet

Public

Pure-PyTorch Parakeet TDT inference

100% credibility

Found Feb 28, 2026 at 19 stars.

AI Analysis

Python

AI Summary

nano-parakeet is a lightweight Python library providing fast, dependency-minimal inference for NVIDIA's Parakeet speech-to-text model using pure PyTorch.

How It Works

🔍 Discover nano-parakeet

You learn about a simple tool that turns audio recordings of people speaking into written text super quickly and accurately.

🛠️ Set up the tool

You easily add this lightweight helper to your computer so it's ready to handle speech-to-text tasks.

🎤 Pick your audio

You select a voice recording, like a meeting note or podcast clip, that you want to convert to text.

✨ Start transcribing

You give the tool your audio file and it instantly processes the speech into readable words.

⚡ See the speed

Everything happens in seconds, much faster than other similar tools, giving you results right away.

✅ Enjoy your text

You now have the complete written version of the spoken words, perfect for reading, sharing, or editing.

AI-Generated Review

What is nano-parakeet?

nano-parakeet delivers pure-PyTorch inference for NVIDIA's Parakeet TDT speech-to-text model in Python, without the official NeMo framework. The package is slim, with just five dependencies (torch, numpy, soundfile, sentencepiece, huggingface-hub). It loads 1.1 GB of weights from Hugging Face and transcribes 16 kHz mono audio files via a one-liner API or a CLI command like `nano-parakeet audio.wav`. It targets NeMo's bloat: no version conflicts, no 30-second cold starts, and transcriptions reported as byte-identical to NeMo's.
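The CLI workflow described above can be sketched with the standard library alone. This is a minimal sketch, not the project's own code: the helper names (`is_16k_mono_wav`, `build_cli_command`, `transcribe_via_cli`) are hypothetical, and it assumes the CLI prints the transcript to stdout, which the review does not confirm. Only the command form `nano-parakeet audio.wav` and the 16 kHz mono requirement come from the review.

```python
import shutil
import subprocess
import wave

EXPECTED_RATE = 16_000   # Hz, the sample rate named in the review
EXPECTED_CHANNELS = 1    # mono


def is_16k_mono_wav(path):
    """Return True if `path` is a PCM WAV file with one channel at 16 kHz."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE
            and wav.getnchannels() == EXPECTED_CHANNELS
        )


def build_cli_command(path):
    """Build the invocation quoted in the review: `nano-parakeet audio.wav`."""
    return ["nano-parakeet", path]


def transcribe_via_cli(path):
    """Pre-check the audio format, then shell out to the CLI.

    Assumption: the transcript is written to stdout.
    """
    if not is_16k_mono_wav(path):
        raise ValueError(f"{path} is not 16 kHz mono WAV")
    if shutil.which("nano-parakeet") is None:
        raise RuntimeError("nano-parakeet CLI not found on PATH")
    result = subprocess.run(
        build_cli_command(path), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

The format check is a cheap pre-flight step for WAV input only; other containers the review mentions (OGG, M4A) are not covered by this sketch.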

Why is it gaining traction?

It cuts dependencies from 180 to 5, reduces cold starts to 3 seconds, and improves the warm real-time factor (RTF) by up to 2.5x on an RTX 4090 or 1.3x on a Jetson AGX Orin, per included benchmarks you can run yourself. It supports OGG/WAV/M4A via ffmpeg, accepts numpy/tensor inputs, and offers optional timestamps at the character, word, and segment level. Developers love the no-fuss install (`pip install nano-parakeet`) and the Jetson tweaks that work without rebuilding PyTorch.
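To make the RTF figures concrete, here is a minimal sketch of how such numbers are commonly computed. It assumes the convention RTF = processing time / audio duration (lower is better); the review does not say which convention the repo's benchmarks use, and some tools report the inverse. The function names are hypothetical and are not part of nano-parakeet.

```python
import time


def real_time_factor(processing_seconds, audio_seconds):
    """RTF under the assumed convention: time spent per second of audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def speedup(baseline_rtf, candidate_rtf):
    """How many times faster the candidate is than the baseline."""
    if candidate_rtf <= 0:
        raise ValueError("candidate_rtf must be positive")
    return baseline_rtf / candidate_rtf


def time_call(fn, *args, warmup=1, runs=3):
    """Average wall-clock seconds per call, after discarding warm-up runs.

    Warm-up calls keep one-time costs (model load, kernel compilation)
    out of the "warm" measurement.
    """
    for _ in range(warmup):
        fn(*args)
    start = time.perf_counter()
    for _ in range(runs):
        fn(*args)
    return (time.perf_counter() - start) / runs
```

Under this convention, a "2.5x" improvement means the candidate's RTF is the baseline's divided by 2.5. Cold-start time would be measured separately, as the time of the first call in a fresh process.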

Who should use this?

Python devs embedding fast STT in web apps, real-time pipelines, or serverless functions where NeMo's overhead kills deployability. Edge ML engineers on Jetson devices needing sub-100ms latency transcription. Audio tool builders wanting Hugging Face integration without framework lock-in.

Verdict

Grab it if NeMo frustrates you: the benchmarks hold up, the API is clean, and the MIT-licensed beta works out of the box on CUDA. With only 19 stars, it's early and unproven at scale despite the 100% credibility score; test thoroughly before prod.

Followers: 367

Base stars: 19 stars

Bonus: AI verified quality (100%)

Account age: 4,160 days

Repo age: 7 days

License: MIT

Updated: Feb 28, 2026