xuiltul / voice-input

Public

Local voice input with screen-aware context. Push-to-talk → Whisper → LLM refinement, all on your own GPU.

100% credibility

Found Feb 12, 2026 at 36 stars.

AI Analysis

Python

AI Summary

A privacy-focused voice-to-text tool for Mac that uses push-to-talk to record speech, transcribes it locally, refines the text using what's on your screen, and pastes the result automatically.

How It Works

🔍 Discover easy voice typing

You hear about a simple tool that lets you speak to type on your Mac, keeping everything private on your computer without sending data online.

📥 Get it ready on your Mac

You download the tool and set it up quickly with a few easy steps, like installing helpers that run locally.

🔓 Allow microphone and screen access

You go to your Mac settings and give permission for the tool to hear you and see your screen to make typing smarter.

▶️ Launch your voice helper

You open two small windows in separate spots, and everything connects smoothly on your own machine.

🎤 Hold Option key and talk naturally

Hold down the left Option key and speak naturally. The tool can use what is on your screen as context, and an on-screen display shows progress while you talk.

✨ Release and text appears polished

Let go of the key, and clean, punctuated text with corrected words is pasted where you're typing, ready to send if you want.

🎉 Type faster and privately

Now you dictate emails, notes, or chats effortlessly with accurate results tailored to your screen, all staying safe on your Mac.
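The hold-to-record, release-to-process flow described in the steps above can be sketched as a small state machine. This is a minimal illustration, not the repo's actual code: the class name PushToTalk and its methods are hypothetical, and real key events and audio capture are replaced by plain method calls.

```python
class PushToTalk:
    """Hypothetical push-to-talk buffer: record while a key is held."""

    def __init__(self) -> None:
        self.recording = False
        self._chunks: list[bytes] = []

    def key_down(self) -> None:
        # Ignore repeated key-down events while already recording.
        if not self.recording:
            self.recording = True
            self._chunks = []

    def feed(self, chunk: bytes) -> None:
        # Audio arriving while the key is not held is discarded.
        if self.recording:
            self._chunks.append(chunk)

    def key_up(self) -> bytes:
        # Stop recording and hand the buffered audio to the next stage.
        if not self.recording:
            return b""
        self.recording = False
        audio = b"".join(self._chunks)
        self._chunks = []
        return audio
```

In a real tool, key_down and key_up would be wired to a global hotkey listener and feed to a microphone callback; those integrations are left out here because the page does not describe them.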


Star Growth

This repo grew from 36 to 43 stars.


AI-Generated Review

What is voice-input?

Voice-input is a Python tool for local voice-to-text on your own GPU. It turns push-to-talk audio (hold Option on Mac) into refined, punctuated text that auto-pastes into any app. It uses Whisper for transcription, a local LLM via Ollama for cleanup such as removing fillers, and a vision model that reads screen context to improve accuracy on technical terms. No cloud is involved: your audio and screenshots stay on your machine, which makes it a replacement for clunky keyboard voice input for devs who hate typing.
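The three stages named above (transcription, screen context, LLM cleanup) can be pictured as a simple chain. The sketch below only shows that shape under stated assumptions: DictationPipeline, build_prompt, and the prompt wording are hypothetical, and each model is an injected callable rather than a real Whisper, vision, or Ollama call.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DictationPipeline:
    """Hypothetical three-stage pipeline; every stage is an injected callable."""

    transcribe: Callable[[bytes], str]       # stand-in for a local Whisper wrapper
    describe_screen: Callable[[bytes], str]  # stand-in for a local vision model
    refine: Callable[[str], str]             # stand-in for a local LLM call

    def build_prompt(self, raw_text: str, screen_context: str) -> str:
        # Assumed prompt layout; the real project's prompt is not shown in the review.
        parts = ["Clean up this dictation: remove fillers, add punctuation."]
        if screen_context:
            parts.append(f"Screen context: {screen_context}")
        parts.append(f"Transcript: {raw_text}")
        return "\n".join(parts)

    def run(self, audio: bytes, screenshot: Optional[bytes] = None) -> str:
        raw_text = self.transcribe(audio)
        # Screen context is optional; skip the vision stage without a screenshot.
        context = self.describe_screen(screenshot) if screenshot else ""
        return self.refine(self.build_prompt(raw_text, context))
```

Injecting the stages as callables keeps the sketch runnable with stubs and makes it clear where a real model wrapper would plug in.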

Why is it gaining traction?

Unlike cloud services, which send your audio off-device, this runs everything locally as a WebSocket server or HTTP API, with Docker support for GPU rigs. Screen-aware context and multi-language auto-detection (ja/en/zh/ko) help with precise dictation, and slash commands such as "/commit" trigger quick actions. It is a local alternative to remote voice tools, aimed at devs tired of laggy, privacy-risky voice input.
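The review does not say how slash commands are recognized, so the following is only a guess at one simple approach: check whether the refined text begins with a known command and split off the rest. The function name split_slash_command and the KNOWN_COMMANDS set are hypothetical; "/commit" is the only command the review names.

```python
from typing import Optional, Tuple

# Only "/commit" is mentioned in the review; any other entry would be an assumption.
KNOWN_COMMANDS = {"/commit"}


def split_slash_command(text: str) -> Tuple[Optional[str], str]:
    """Return (command, remainder), or (None, text) if no known command leads."""
    stripped = text.strip()
    if not stripped.startswith("/"):
        return None, stripped
    head, _, rest = stripped.partition(" ")
    if head.lower() in KNOWN_COMMANDS:
        return head.lower(), rest.strip()
    # Unknown slash words are treated as ordinary dictated text.
    return None, stripped
```

For example, split_slash_command("/commit fix typo") yields the command "/commit" with the remainder "fix typo", while ordinary dictation passes through unchanged.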

Who should use this?

Mac devs who dictate code comments or GitHub PRs during pair programming, especially Japanese speakers who need fast local voice AI. Linux GPU users building local voice assistants or home setups without cloud dependencies. Anyone evaluating local GitHub Copilot alternatives for hands-free workflows in terminals or IDEs.

Verdict

Grab it if you have Apple Silicon (16GB+) or an NVIDIA GPU. Setup is straightforward, with solid docs and Ollama integration, and it delivers real privacy wins despite 17 stars and a 1.0% credibility score signaling early maturity. Test on short clips first: it lacks broad testing but punches above its weight for local voice input on GitHub.



Base stars: 43

Bonus: AI verified quality (100%)

Account age: 4,598 days

Repo age: 20 days

License: MIT

Updated: Feb 27, 2026