ivan-digital/qwen3-asr-swift

AI speech toolkit for Apple Silicon — ASR, TTS, speech-to-speech, VAD, and diarization powered by MLX and CoreML

AI Summary

A Swift tool that converts spoken audio into text using Alibaba's Qwen3-ASR recognition model, designed to run efficiently on Apple Silicon devices and supporting 52 languages.

How It Works

1. 🔍 Discover speech-to-text magic

You hear about a free tool that turns any audio recording into written text, understanding 52 languages right on your Apple Mac or iPhone.

2. 📱 Get it ready on your Mac

Follow the setup guide to install the tool on your Apple computer; no special skills needed.

3. 🌍 Pick your voice model

Choose a smaller or more powerful model variant depending on how much noise and which dialects you need it to handle.

4. 🎤 Choose your audio

Grab any sound file, like a voice memo, interview, or podcast clip, from your files.

5. 🎧 Listen and transcribe

Hit go, and the tool listens to your audio, runs the recognition model quickly, and outputs the words as readable text (a code sketch of this flow follows the list).

6. ✅ Review the results

The full transcription appears within seconds, ready to copy, edit, or share.
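As a rough illustration of steps 2 through 6, here is a minimal Swift sketch of what that flow might look like. The module, type, and method names (`Qwen3ASR`, `load`, `transcribe`) are assumptions chosen for readability, not the library's confirmed API, so check the repo's README for the real calls.

```swift
import Foundation
import Qwen3ASR   // hypothetical module name; the actual product may differ

// Pick a model variant (step 3); weights are fetched automatically on first use.
let model = try await Qwen3ASR.load(variant: .base)

// Point at any local audio file (step 4): a voice memo, interview, or podcast clip.
let audioURL = URL(fileURLWithPath: "interview.wav")

// Transcribe and print the result (steps 5 and 6).
let result = try await model.transcribe(audioURL, language: "en")
print(result.text)
```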

AI-Generated Review

What is qwen3-asr-swift?

This Swift repo delivers a pure Swift implementation of Alibaba's Qwen3-ASR speech-to-text model, optimized for Apple Silicon using MLX Swift. It transcribes audio in 52 languages (30 major languages plus 22 Chinese dialects) with strong noise robustness, processing a 10-second clip in ~0.6 s on M-series chips via a simple API or the bundled CLI. Import it via Swift Package Manager into your Apple app for on-device ASR, with no Python runtime or cloud dependency needed.
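For context, here is a Package.swift sketch of what the Swift Package Manager integration could look like; the repository URL, version, and product name below are assumptions based on the repo and owner names, so confirm them against the README before using.

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "VoiceNotesApp",
    platforms: [.macOS(.v14), .iOS(.v17)],
    dependencies: [
        // Assumed URL and version; check the repo's README for the real coordinates.
        .package(url: "https://github.com/ivan-digital/qwen3-asr-swift", from: "0.1.0")
    ],
    targets: [
        .executableTarget(
            name: "VoiceNotesApp",
            dependencies: [
                // Product name is an assumption based on the repo name.
                .product(name: "Qwen3ASR", package: "qwen3-asr-swift")
            ]
        )
    ]
)
```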

Why is it gaining traction?

It crushes Whisper in noisy environments (17% vs 63% word error rate) at half the size, with a real-time factor under 0.07 for a real-time feel on Macs and iPhones. Native Swift means seamless GitHub Actions CI, no FFI hassles, and streaming transcription support. Devs dig the auto-downloading models and cache control for quick prototypes.
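Below is a hedged sketch of how the streaming transcription and model-cache control mentioned above might be wired up; the session API, callback shape, and `cacheDirectory` parameter are assumptions for illustration, not the library's documented interface.

```swift
import AVFoundation
import Qwen3ASR   // hypothetical module name

// Assumed option for redirecting the auto-downloaded model cache.
let cacheDir = URL(fileURLWithPath: "/tmp/qwen3-models", isDirectory: true)
let model = try await Qwen3ASR.load(variant: .base, cacheDirectory: cacheDir)

// Assumed chunk-based streaming session: push audio buffers as they arrive
// and receive partial transcripts back.
let session = model.makeStreamingSession(language: "en")
session.onPartialResult = { partial in
    print("partial:", partial.text)
}

// Placeholder buffers; in a real app these would come from an AVAudioEngine input tap.
let capturedBuffers: [AVAudioPCMBuffer] = []
for buffer in capturedBuffers {
    try await session.append(buffer)
}
let final = try await session.finish()
print("final:", final.text)
```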

Who should use this?

iOS/macOS devs building voice-note, meeting-transcription, or podcast apps on Apple hardware. Swift projects needing local ASR without server costs, especially for multilingual or noisy-audio use cases like field recordings. Teams using GitHub Copilot with Swift for rapid voice UI integration.

Verdict

Grab it for on-device Apple ASR experiments: a solid README, a CLI (`qwen3-asr-cli audio.wav`), and tests make it dev-ready despite its early-stage star count. Early maturity means watch for streaming polish, but it's a smart pure-Swift reference for on-device ML.
