yeahhe365

Browser-based Android phone agent using WebADB/WebUSB and OpenAI-compatible vision models

85% credibility
Found May 21, 2026 at 11 stars.
AI Analysis
TypeScript
AI Summary

WebDroid Agent is a browser-based tool that lets you control an Android phone using AI. You connect your phone to Chrome with a USB cable, then describe a task in plain language—like 'open Settings and find Wi-Fi.' The app shows your phone's screen to an AI, which decides what to tap, swipe, or type. It executes each action on your phone, takes a new screenshot, and repeats until the task is done. The app includes safety features like requiring confirmation for sensitive actions, stopping after a set number of steps, and letting you stop the run at any time. It's designed for experimenting with AI-powered phone automation in a local, controlled environment—not for handling payments, logins, or other sensitive tasks.
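As a rough illustration of the "executes each action" step, the sketch below maps a tap, swipe, text, or key action to an Android `input` shell command. The `Action` type and the function names are hypothetical and not taken from the project; only the `input tap`, `input swipe`, `input text`, and `input keyevent` commands are standard Android shell tools.

```typescript
// Hypothetical action shape for illustration; the project's real types may differ.
type Action =
  | { kind: "tap"; x: number; y: number }
  | { kind: "swipe"; x1: number; y1: number; x2: number; y2: number; durationMs: number }
  | { kind: "type"; text: string }
  | { kind: "key"; keycode: string };

// Wrap a string in single quotes so the device shell treats it as one argument.
function shellQuote(s: string): string {
  return "'" + s.replace(/'/g, "'\\''") + "'";
}

// Translate one action into the shell command that would be sent over ADB.
function toShellCommand(a: Action): string {
  switch (a.kind) {
    case "tap":
      return `input tap ${Math.round(a.x)} ${Math.round(a.y)}`;
    case "swipe":
      return (
        `input swipe ${Math.round(a.x1)} ${Math.round(a.y1)} ` +
        `${Math.round(a.x2)} ${Math.round(a.y2)} ${Math.round(a.durationMs)}`
      );
    case "type":
      // `input text` expects spaces to be written as %s.
      return `input text ${shellQuote(a.text.replace(/ /g, "%s"))}`;
    case "key":
      return `input keyevent ${a.keycode}`;
  }
}
```

For example, `toShellCommand({ kind: "key", keycode: "KEYCODE_HOME" })` yields `input keyevent KEYCODE_HOME`. How the command string actually reaches the device over WebUSB is left out here.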

How It Works

1
🔍 You hear about browser-controlled phones

A friend tells you about a tool that can control an Android phone right from Chrome, using AI to understand what it sees on the screen.

2
📱 You plug in your Android phone

Using a USB cable, you connect your Android phone to your computer and enable debugging mode on the phone. Chrome asks for permission to talk to your device.

3
🤖 You connect your AI assistant

You enter the address of your AI service and your personal key. The app remembers these settings for next time.

4
You describe what you want done

In plain English, you type something like 'Open Settings and find the Wi-Fi page.' The app takes a screenshot and shows it to the AI.

5
👀 The AI looks at your phone screen

The AI studies the screenshot, understands the layout, and decides what action to take next—like tapping a button or opening an app.

6
The action runs automatically or waits for you

▶️ Auto mode: Safe actions run automatically while the app shows you each step.

⏸️ Step-by-step mode: You review and approve every action before it runs on your phone.

7
🔄 The loop continues until done

After each action, the app takes a new screenshot and asks the AI what to do next. This repeats until your task is complete or you stop it.

🎉 Your task is complete

The AI finishes what you asked, or asks you to take over if it hits something it can't handle—like entering a password. You can export a log of everything that happened.
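The steps above amount to a capture, decide, act loop with a few exits. A minimal sketch follows, assuming the screenshot, model call, and device execution are injected as async functions. Every name here (`AgentDeps`, `runAgent`, the `"done"` action, the outcome labels) is hypothetical and chosen for illustration, not taken from the project.

```typescript
// Hypothetical dependencies; in a real app these would wrap WebUSB/ADB and the model API.
interface AgentDeps {
  capture: () => Promise<string>; // returns a screenshot, e.g. base64 PNG
  decide: (task: string, screenshot: string) => Promise<{ action: string }>;
  execute: (action: { action: string }) => Promise<void>;
  approve: (action: { action: string }) => Promise<boolean>; // step-by-step mode only
  shouldStop: () => boolean; // true once the user presses Stop
}

type Outcome = "done" | "stopped" | "rejected" | "step_limit";

async function runAgent(
  task: string,
  deps: AgentDeps,
  maxSteps: number,
  stepByStep: boolean
): Promise<{ outcome: Outcome; steps: number }> {
  for (let steps = 0; steps < maxSteps; steps++) {
    // The user can interrupt between any two steps.
    if (deps.shouldStop()) return { outcome: "stopped", steps };

    const screenshot = await deps.capture();
    const action = await deps.decide(task, screenshot);

    // The model signals completion with a (hypothetical) "done" action.
    if (action.action === "done") return { outcome: "done", steps };

    // In step-by-step mode nothing runs without explicit approval.
    if (stepByStep && !(await deps.approve(action))) {
      return { outcome: "rejected", steps };
    }
    await deps.execute(action);
  }
  // The step budget is the backstop against a model that never finishes.
  return { outcome: "step_limit", steps: maxSteps };
}
```

Injecting the dependencies keeps the loop testable without a phone attached: fake `capture` and `decide` functions are enough to exercise each exit path.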

AI-Generated Review

What is WebDroid-Agent?

WebDroid-Agent is a browser-based Android automation tool that lets you control a real phone through Chrome or Edge using WebUSB. It captures the device screen, sends it to any OpenAI-compatible vision model, and executes the model's decisions by running ADB commands directly on the phone. Built in TypeScript with React and Vite, it runs entirely in the browser with no backend required. You give it a task like "Open Settings and navigate to Wi-Fi," and it loops through capture-analyze-act until the job is done or it hits a limit you set.
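To make "sends it to any OpenAI-compatible vision model" concrete, here is a hedged sketch of a request body in the OpenAI chat completions format, with the screenshot attached as a base64 data URL. The function name, the system prompt wording, and the PNG assumption are illustrative; the project's actual prompts and payload may differ.

```typescript
// Subset of the OpenAI chat completions request shape used by this sketch.
interface ChatRequest {
  model: string;
  messages: Array<{
    role: "system" | "user";
    content:
      | string
      | Array<
          | { type: "text"; text: string }
          | { type: "image_url"; image_url: { url: string } }
        >;
  }>;
}

// Build the body for one step: the task text plus the current screenshot.
function buildVisionRequest(
  model: string,
  task: string,
  screenshotBase64: string
): ChatRequest {
  return {
    model,
    messages: [
      {
        role: "system",
        // Hypothetical instruction; the real prompt is not shown in the source.
        content: "You control an Android phone. Reply with exactly one JSON action.",
      },
      {
        role: "user",
        content: [
          { type: "text", text: `Task: ${task}` },
          {
            type: "image_url",
            // Assumes the screenshot is a PNG encoded as base64.
            image_url: { url: `data:image/png;base64,${screenshotBase64}` },
          },
        ],
      },
    ],
  };
}
```

In use, the body would be sent as JSON in a POST to the service's `/chat/completions` endpoint with an `Authorization: Bearer <key>` header, which is the convention OpenAI-compatible servers follow.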

Why is it gaining traction?

The hook is zero-infrastructure automation. You do not spin up a server, configure ADB over the network, or need anything beyond npm and a USB cable. Everything happens in a single React app, and it works with any vision model that speaks the OpenAI API format. The project also natively supports two action protocols: canonical JSON for general models and an Open-AutoGLM style for specific Chinese phone-agent models. Safety controls such as step limits, confirmation dialogs for sensitive actions, and manual takeover requests make it practical for local experimentation without accidentally wiping a device.
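A small sketch of how a canonical JSON reply might be parsed and then gated by the safety controls. The `action` field, the list of sensitive action names, and the verdict labels are assumptions made for this example; the project's real schema is not shown in the source, and the Open-AutoGLM style would need its own parser.

```typescript
type Verdict = "run" | "confirm" | "stop";

interface ParsedAction {
  action: string;
  [key: string]: unknown;
}

// Hypothetical names of actions that should never run without confirmation.
const SENSITIVE_ACTIONS: string[] = ["type", "install", "uninstall"];

// Extract a JSON action from a model reply, tolerating surrounding prose.
function parseAction(reply: string): ParsedAction | null {
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  try {
    const value: unknown = JSON.parse(reply.slice(start, end + 1));
    if (
      typeof value === "object" &&
      value !== null &&
      typeof (value as { action?: unknown }).action === "string"
    ) {
      return value as ParsedAction;
    }
  } catch (_err) {
    // Malformed JSON is treated the same as no action at all.
  }
  return null;
}

// Decide whether an action may run, needs confirmation, or must stop the run.
function gate(action: ParsedAction, stepsTaken: number, maxSteps: number): Verdict {
  if (stepsTaken >= maxSteps) return "stop";
  if (SENSITIVE_ACTIONS.indexOf(action.action) !== -1) return "confirm";
  return "run";
}
```

Checking the step limit before anything else means an exhausted budget stops the run even when the next action would otherwise be harmless.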

Who should use this?

Researchers testing whether a vision model can understand real mobile UIs. Developers prototyping Android automation workflows without setting up Appium or similar infrastructure. Anyone exploring phone-agent architectures who wants a quick, local loop without cloud dependencies. It is not suitable for production workloads, multi-device orchestration, or handling logins, payments, or account settings.

Verdict

This is a legitimate experimental toolkit for the phone-agent problem space. The credibility score of 85% reflects the early stage: 11 stars, thin test coverage, and documentation that assumes familiarity with WebUSB and ADB. The architecture is sound and the codebase is well-organized, but real-device validation is still needed to confirm reliability across Android versions. Worth exploring if you are building in this domain, but treat it as a starting point rather than a finished product.
