Purewhiter

MobileGym: A Verifiable and Scalable Simulation Environment for Mobile GUI Agent Research

94% credibility
Found May 27, 2026 at 29 stars.
AI Analysis
TypeScript
AI Summary

MobileGym is an academic research platform that simulates a mobile phone environment in a web browser. It contains independently-built research versions of 28 popular apps (like messaging, payments, video, and social media) that never connect to real services. The platform is designed for AI researchers to train and evaluate "GUI agents" - AI systems that interact with phone interfaces by tapping, typing, and swiping. It includes 416 pre-made task challenges with deterministic verification, supports running 256 simulations in parallel, and has been validated to transfer skills learned in simulation to real Android devices. The project is clearly marked as research software, not affiliated with any of the real apps it simulates.

How It Works

1. 🔬 Discover a Mobile AI Lab

You hear about a new research platform that lets AI assistants practice using phone apps in a safe, simulated environment.

2. 📱 See 28 Familiar Apps

The platform includes fake versions of apps like messaging, payments, video, and social media - all built for research, not real use.

3. 🤖 Watch AI Agents Learn

Researchers can train AI to complete tasks like sending messages, checking balances, or browsing content - with instant feedback on success.

4. Choose Your Path

📊 Test Existing Challenges

Run one of 416 ready-made tasks to see how well different AI models perform on mobile interactions.

🛠️ Create Custom Tasks

Write new scenarios and judges so the AI can practice skills you specifically need.

5. Run Hundreds at Once

The platform runs up to 256 simulations in parallel on one computer, so experiments finish in minutes instead of hours.

6. Get Verified Results

Every task has a deterministic judge that checks the final state - no guessing, just clear pass or fail results.

🎯 Research Ready

You've got a safe, scalable environment to study how AI agents interact with mobile interfaces - with results that transfer to real devices.

AI-Generated Review

What is MobileGym?

MobileGym is a browser-based simulation environment for training and evaluating mobile GUI agents. Built with TypeScript and React, it replicates 28 popular apps like Alipay, WeChat, and Bilibili inside a virtual Android-like shell. The platform exposes the entire application state as structured JSON, allowing programmatic verification of agent actions instead of relying on slow, error-prone vision models. Researchers get a sandbox where agents can transfer money, send messages, and navigate complex flows without touching real accounts or funds.
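To make the idea of state-based verification concrete, here is a minimal sketch of a judge that inspects JSON state before and after an episode. The `AppState` shape, the `judgeTransfer` function, and all field names are hypothetical illustrations, not MobileGym's actual schema or API.

```typescript
// Hypothetical state shape for a payments app; the real schema may differ.
interface Transfer {
  to: string;
  amountCents: number;
}

interface AppState {
  wallet: { balanceCents: number; transfers: Transfer[] };
}

interface Verdict {
  pass: boolean;
  reason: string;
}

// Deterministic judge: passes only if exactly one new transfer was recorded,
// it went to the expected payee for the expected amount, and the balance
// dropped by exactly that amount.
function judgeTransfer(
  before: AppState,
  after: AppState,
  to: string,
  amountCents: number
): Verdict {
  const added = after.wallet.transfers.slice(before.wallet.transfers.length);
  if (added.length !== 1) {
    return { pass: false, reason: `expected 1 new transfer, found ${added.length}` };
  }
  const transfer = added[0]!;
  if (transfer.to !== to || transfer.amountCents !== amountCents) {
    return { pass: false, reason: "wrong payee or amount" };
  }
  if (after.wallet.balanceCents !== before.wallet.balanceCents - amountCents) {
    return { pass: false, reason: "balance does not match the transfer" };
  }
  return { pass: true, reason: "ok" };
}
```

Because the check is a pure function of two state snapshots, the same episode always yields the same verdict, which is the property a screenshot-based judge cannot guarantee.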

Why is it gaining traction?

The key differentiator is deterministic verification. Traditional benchmarks fall back on vision-language model (VLM) judges that misjudge outcomes roughly 10% of the time. MobileGym's state-based judges run in under a millisecond and catch unintended side effects, such as accidentally following a user. The platform also scales on a single machine: you can run 256 parallel simulator instances on one server using under 10% CPU, enabling fast iteration cycles for reinforcement learning experiments. Perhaps most compellingly, the team demonstrated that 95.1% of training gains transfer to a real Redmi Note 12 Turbo, bridging the sim-to-real gap that plagues most agent benchmarks.
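One way a state-based judge could catch side effects is to diff the state snapshots and flag any changed path the task did not authorize. The sketch below assumes the state is plain JSON; `flatten`, `changedPaths`, and `unexpectedChanges` are hypothetical helper names, not functions from the MobileGym codebase.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Flatten nested JSON into "a.b.0" style paths mapped to serialized leaf values.
function flatten(
  value: Json,
  prefix = "",
  out: Map<string, string> = new Map()
): Map<string, string> {
  const join = (key: string): string => (prefix ? `${prefix}.${key}` : key);
  if (Array.isArray(value)) {
    value.forEach((item, index) => flatten(item, join(String(index)), out));
  } else if (value !== null && typeof value === "object") {
    for (const [key, item] of Object.entries(value)) flatten(item, join(key), out);
  } else {
    out.set(prefix, JSON.stringify(value));
  }
  return out;
}

// Every leaf path that was added, removed, or modified, in sorted order.
function changedPaths(before: Json, after: Json): string[] {
  const a = flatten(before);
  const b = flatten(after);
  const changed = new Set<string>();
  a.forEach((v, path) => { if (b.get(path) !== v) changed.add(path); });
  b.forEach((v, path) => { if (a.get(path) !== v) changed.add(path); });
  return Array.from(changed).sort();
}

// Changed paths that fall outside every allowed path (exact match or sub-path).
function unexpectedChanges(before: Json, after: Json, allowed: string[]): string[] {
  return changedPaths(before, after).filter(
    (path) => !allowed.some((a) => path === a || path.startsWith(`${a}.`))
  );
}
```

For a task such as "like this video", the allowed list might contain only a path like `video.liked`; if the agent also taps a follow button, the new entry under a path like `social.following` shows up in the result and the judge can fail the episode.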

Who should use this?

Researchers benchmarking mobile GUI agents will find the most value here. The pre-built task suite with 416 templates covers diverse scenarios, and the programmatic judge system eliminates the overhead of building custom evaluation pipelines. RL practitioners interested in online training for vision-language models on mobile tasks can leverage the parallel rollout infrastructure. App developers prototyping automation flows might also use it as a low-stakes testing environment, though this is secondary to the research focus.
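As a rough illustration of parallel rollouts, the sketch below runs a batch of tasks with a fixed cap on how many are in flight at once. It is a generic bounded-concurrency pool, not MobileGym's rollout infrastructure; `runPool` and the `worker` callback are hypothetical names, and a real worker would drive one simulator instance per task.

```typescript
// Run `worker` over every item with at most `limit` calls in flight.
// Results are returned in the same order as `items`.
async function runPool<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each lane repeatedly claims the next unprocessed index until none remain.
  // Claiming is safe without locks because JavaScript runs this code on a
  // single thread and `next++` completes before any await.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]!, index);
    }
  }

  const laneCount = Math.max(0, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

With a limit of 256 and a worker that resets a simulator, runs the agent, and calls the task's judge, the returned array of verdicts can be reduced to a pass rate per task template.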

Verdict

With a credibility score of roughly 0.95 and only 29 stars, MobileGym is early-stage but backed by a detailed arXiv paper and validated sim-to-real transfer numbers. The documentation is thorough, the modular app architecture makes extension straightforward, and the dual Apache 2.0 / CC BY-NC 4.0 licensing separates reusable code from research content. Worth evaluating now if you're serious about mobile agent research, but watch for maturity as the community grows.
