dseditor / QwenASRMiniTool

Public

A compact QwenASR tool based on OpenVINO INT8 weights, for real-time recognition and subtitle conversion.

204

100% credibility

Found Feb 20, 2026 at 55 stars (about 4x growth since).

AI Analysis

Python

AI Summary

A Windows desktop app that converts audio files or live microphone speech into SRT subtitles with multi-language support, speaker identification, and optional accuracy-boosting hints, running entirely on CPU.

How It Works

🔍 Discover the tool

You find a simple app that turns spoken words in audio files or from your microphone into easy-to-read subtitles.

💻 Download and open

Download the program from the repository and launch it on your Windows computer.

📥 Set it up once

On first run the app asks for a folder to store its model files (about 1.2 GB), then downloads them automatically.

✅ All set and waiting

A green checkmark appears once the model is loaded and ready to run on the CPU, with no GPU required.

Choose your way

📁

Use an audio file

Pick your MP3, WAV, or other recording, add optional hints like lyrics, choose if speakers should be labeled, and start.

🎤

Speak live

Select your microphone, add hints if helpful, and begin talking – it listens and types as you pause.

▶️ Start the magic

Click the start button and watch the progress as the audio is split into timed segments and transcribed.

🎉 Get your subtitles

Open the output folder to find your new SRT subtitle file with timestamps, your chosen language, and speaker labels if enabled, ready for videos or notes.
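One way speaker names could end up on subtitle lines is to give each transcribed segment the speaker whose turn overlaps it most. The sketch below assumes that approach; the page does not say how the tool does it. The function names, the tuple layouts, and the "Speaker 1: text" label format are all hypothetical and chosen for illustration only.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time spans (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments, turns, unknown="Speaker ?"):
    """Prefix each segment's text with the best-overlapping speaker.

    segments: list of (start_sec, end_sec, text)      -- hypothetical layout
    turns:    list of (start_sec, end_sec, speaker)   -- hypothetical layout
    """
    labeled = []
    for start, end, text in segments:
        best_speaker, best_overlap = unknown, 0.0
        for t_start, t_end, speaker in turns:
            shared = overlap(start, end, t_start, t_end)
            if shared > best_overlap:
                best_speaker, best_overlap = speaker, shared
        labeled.append((start, end, f"{best_speaker}: {text}"))
    return labeled
```

A segment that overlaps no speaker turn keeps the placeholder label, so gaps in the diarization stay visible instead of being silently assigned to a neighbor.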


Star Growth

This repo grew from 55 to 204 stars.


AI-Generated Review

What is QwenASRMiniTool?

QwenASRMiniTool is a Python desktop app that turns audio files or live mic input into timed SRT subtitles using the quantized Qwen3-ASR model via OpenVINO on CPU. It handles formats like MP3, WAV, and M4A, supports 30 languages including Chinese, Japanese, and English with Traditional Chinese output, and auto-downloads a 1.2GB model on first run. Developers get a portable GUI for quick transcription without GPU setup or cloud dependencies.
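To make the "timed SRT subtitles" output concrete, here is a minimal sketch of how recognized segments can be rendered in the SubRip layout (a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, then the text). The `Segment` class and both function names are hypothetical and are not taken from the tool's source.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One recognized chunk of speech (hypothetical structure for this sketch)."""
    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as SRT blocks separated by a blank line."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(blocks)
```

Writing the result to disk with UTF-8 encoding (for example `open(path, "w", encoding="utf-8")`) keeps Chinese and Japanese text intact in most video players.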

Why is it gaining traction?

It stands out with built-in speaker diarization for podcasts, which labels speakers in the SRT output, and with recognition prompts (such as song lyrics) that boost accuracy on difficult audio. Real-time mode transcribes mic input at speech pauses, which makes it practical for live demos, while VAD (voice activity detection) keeps segments clean by cutting at silence. The one-click EXE build and Windows focus lower the barrier to entry compared with heavier ASR frameworks.
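The idea of cutting audio at pauses can be sketched with a simple energy threshold. This is only a stand-in: the page does not say which VAD the tool uses, and a real detector is usually more robust than raw RMS energy. The function names and the default values for `frame_ms`, `threshold`, and `min_silence_ms` are assumptions made for this example.

```python
import math


def frame_rms(frame):
    """Root-mean-square energy of one frame of float samples in [-1, 1]."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0


def split_on_silence(samples, sample_rate, frame_ms=20, threshold=0.02,
                     min_silence_ms=300):
    """Return (start_sec, end_sec) speech spans, each closed by a long enough pause."""
    frame_len = max(1, sample_rate * frame_ms // 1000)
    min_silent_frames = max(1, min_silence_ms // frame_ms)
    n_frames = math.ceil(len(samples) / frame_len)
    spans = []
    start = None      # frame index where the current speech span began
    silent_run = 0    # consecutive silent frames seen inside the current span
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        if frame_rms(frame) >= threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silent_frames:
                end = i - silent_run + 1  # index of the first silent frame
                spans.append((start * frame_len / sample_rate,
                              end * frame_len / sample_rate))
                start, silent_run = None, 0
    if start is not None:
        # Audio ended mid-span: close it at the last non-silent frame.
        end_sample = min((n_frames - silent_run) * frame_len, len(samples))
        spans.append((start * frame_len / sample_rate, end_sample / sample_rate))
    return spans
```

In a live setting the same logic would run frame by frame on the microphone stream, handing each closed span to the recognizer as soon as the pause is long enough. Short pauses below `min_silence_ms` do not split a span, which avoids cutting sentences at every breath.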

Who should use this?

Podcasters transcribing interviews, video editors generating subtitles for multilingual clips, and educators captioning lectures. It also suits Python scripters prototyping local ASR pipelines, especially those avoiding NVIDIA dependencies, and hobbyists subtitling songs with custom prompts.

Verdict

Grab it if you need dead-simple CPU speech-to-text in Python: the docs are solid and the features pack a punch for a repo with 204 stars. The repo is only 12 days old, so test it on your own audio before relying on it in production.



Stars: 204

Base stars: 204

Bonus: AI verified quality (100%)

Account age: 1,621 days

Repo age: 12 days

Updated: Mar 04, 2026