ZJU-REAL / KnowU-Bench
Official code for "KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation"
KnowU-Bench is an interactive benchmark for evaluating AI agents that perform personalized and proactive tasks on simulated Android phones.
How It Works
You find this benchmark on GitHub or in a research paper and want to test how well AI assistants understand personal phone habits.
You install a few simple tools so your computer can run virtual Android phones.
With one command, you launch several realistic Android phones ready for testing.
You pick an AI assistant and select tasks like daily routines or personal preferences to evaluate.
Your AI takes over the phones, making decisions based on user habits, asking questions when needed, and handling real interactions.
You open a web viewer to see step-by-step actions, screenshots, scores, and detailed reports.
You see how well your assistant knows you, spot its weaknesses, and get ideas for improvements.
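The interaction described above, where the assistant either acts on the phone or asks the user a question before acting, can be pictured with a toy loop. This is only an illustrative sketch: it does not use the KnowU-Bench code or API, and every name in it (`ToyUser`, `ToyAgent`, `run_episode`, the `alarm_time` preference, the `set_alarm` action string) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    kind: str    # "ask" (question to the user) or "act" (action on the phone)
    detail: str


@dataclass
class ToyUser:
    """Hypothetical simulated user holding hidden preferences."""
    preferences: dict

    def answer(self, question: str) -> str:
        # Reply with the stored value if the question names a known key.
        for key, value in self.preferences.items():
            if key in question:
                return value
        return "no preference"


@dataclass
class ToyAgent:
    """Hypothetical agent: asks once about the alarm time, then acts."""
    known: dict = field(default_factory=dict)

    def next_step(self, task: str) -> Step:
        if "alarm_time" not in self.known:
            return Step("ask", "What alarm_time do you prefer?")
        return Step("act", f"set_alarm({self.known['alarm_time']})")


def run_episode(agent, user, task, expected_action, max_steps=5):
    """Run one episode and return (trace, success)."""
    trace = []
    for _ in range(max_steps):
        step = agent.next_step(task)
        trace.append(step)
        if step.kind == "ask":
            # Feed the user's reply back to the agent.
            agent.known["alarm_time"] = user.answer(step.detail)
        else:
            # The episode ends on the first phone action; score it.
            return trace, step.detail == expected_action
    return trace, False


user = ToyUser({"alarm_time": "06:30"})
trace, success = run_episode(
    ToyAgent(), user, "Set my usual weekday alarm", "set_alarm(06:30)"
)
```

The trace of `Step` objects stands in for the step-by-step record a viewer would display, and `success` stands in for a task score. A real benchmark would drive an Android emulator and use a far richer scoring scheme.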