huseyinstif

If it's on the screen, it's an API. Control any desktop app via REST + MCP. Rust.

100% credibility
Found Mar 07, 2026 at 11 stars.
AI Analysis
Rust
AI Summary

OculOS is a cross-platform desktop tool that reads an application's user interface structure using built-in system features and provides a web dashboard and protocol for inspecting and automating interactions with buttons, fields, and menus.

How It Works

1. 🔍 Discover OculOS

You hear about a handy tool that lets you peek inside any app on your computer and control buttons, menus, and text fields just like magic.

2. 🛠️ Get it running

Download the tiny program and launch it—it starts a helper that watches your screen's elements safely.

3. 👁️ Allow screen access

In your computer's settings, grant permission so it can read what's on screen, like approving a helpful assistant.

4. 🌐 Open the dashboard

Visit a simple web page on your own computer to see a list of all open apps and their inner workings.

5. Choose how to interact

🖱️ Manual control

Browse the app's structure, test clicks and typing right in the inspector tool.

🤖 Smart automation

Connect your AI companion to make it handle tasks like searching and playing songs in music apps.

🚀 Apps under your command

Now you effortlessly automate everyday tasks, like controlling apps without lifting a finger.

AI-Generated Review

What is OculOS?

OculOS is a Rust daemon that exposes every UI element in any desktop app (buttons, text fields, menus) as structured JSON through REST API endpoints, using your OS accessibility tree. Send curl requests to list windows, query elements, click, type, toggle, or scroll, all without screenshots or pixel math. It also runs an MCP server so that AI agents such as Claude can control apps autonomously, plus a built-in dashboard for inspecting the element tree and recording actions as curl, JS, or Python.
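
As a rough illustration of the request flow described above, the Python sketch below builds (but does not send) the kind of HTTP requests such a daemon might accept. The base URL, port, paths, element ID, and JSON field names are all assumptions made up for illustration, not the project's documented API; the repository's README is the place to check the real routes.

```python
import json
import urllib.request

# Hypothetical values: the real host, port, and routes may differ.
BASE_URL = "http://127.0.0.1:7878"


def build_request(method, path, payload=None):
    """Build (without sending) a JSON request against the assumed local daemon."""
    data = None
    headers = {"Accept": "application/json"}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


# Illustrative calls; every path and field name here is an assumption.
list_windows = build_request("GET", "/windows")
click_button = build_request("POST", "/interact/example-element-id/click")
type_text = build_request(
    "POST", "/interact/example-element-id/set-text", {"text": "hello"}
)

for req in (list_windows, click_button, type_text):
    print(req.get_method(), req.full_url)
```

With the daemon running, each request object could be passed to `urllib.request.urlopen` and the response body decoded with `json.loads`; building the requests separately keeps the sketch runnable without a live server.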

Why is it gaining traction?

Unlike vision agents that need GPUs and take seconds per action, or coordinate-based tools that break when a window is resized, OculOS delivers instant, deterministic control by addressing elements through native accessibility semantics. The dashboard is useful for debugging element trees live, with WebSocket events and a recorder that exports real scripts. The MCP server plugs straight into Claude Desktop or Cursor, so AI-driven desktop automation needs no custom prompts.
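
To make the contrast with coordinate-based tools concrete, here is a minimal Python sketch of looking up a control by role and label in a tree of UI elements. The node shape (`role`, `label`, `bounds`, `children`) and the sample window are made-up stand-ins for an accessibility tree, not the JSON schema the project actually returns.

```python
def find_element(node, role, label):
    """Depth-first search for the first node matching role and label."""
    if node.get("role") == role and node.get("label") == label:
        return node
    for child in node.get("children", []):
        match = find_element(child, role, label)
        if match is not None:
            return match
    return None


def make_window(width):
    """Hypothetical tree; the Play button's position depends on window width."""
    return {
        "role": "window",
        "label": "Player",
        "bounds": [0, 0, width, 600],
        "children": [
            {"role": "edit", "label": "Search",
             "bounds": [10, 10, width - 20, 30], "children": []},
            {"role": "button", "label": "Play",
             "bounds": [width - 90, 550, 80, 40], "children": []},
        ],
    }


small = find_element(make_window(800), "button", "Play")
large = find_element(make_window(1600), "button", "Play")

# The same query finds the button in both layouts, although its x coordinate moved.
print(small["bounds"][0], large["bounds"][0])  # 710 1510
```

A script that clicked at a hard-coded x of 710 would miss the button in the wider window, while the role-and-label query is unaffected by the layout change.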

Who should use this?

Automation scripters replacing flaky pyautogui or AutoHotkey with reliable APIs for cross-app workflows. AI devs building agents that search Spotify, fill forms in Electron apps, or navigate Qt tools. QA engineers scripting UI tests in CI for Win32/WPF/GTK desktops, dodging Selenium's browser limits.

Verdict

Early gem at 11 stars and 100% credibility: the docs are polished and the binary is fast, but test permissions (macOS needs a System Settings change) and edge cases yourself. Star it if desktop automation over REST or MCP fits your stack; the SDKs on the roadmap could make it a staple.
