icip-cas

Towards Real-world Human Behavior Simulation: Benchmarking Large Language Models on Long-horizon, Cross-scenario, Heterogeneous Behavior Traces

28
2
100% credibility
Found Apr 12, 2026 at 28 stars.
AI Analysis
AI Summary

OmniBehavior is a research benchmark dataset of real user interactions across multiple scenarios on a short-video platform, designed to evaluate AI models on simulating long-term human behaviors.

How It Works

1
Discover OmniBehavior

You stumble upon this fascinating project while searching for real-world human behavior data for your studies.

2
Explore the details

You read the welcoming page and learn how it captures everyday actions like watching videos, shopping, and chatting on a popular app.

3
Grab the sample data

You download the free demo file showing a full 90 days of one person's real activities across different app features.

4
Dive into the stories

You open the sample and see a timeline of what the person did, when, and why it matters for understanding habits.

5
Apply it to your work

You use this rich example to test ideas on simulating human-like behaviors in AI projects.

Boost your research

Your studies level up with authentic insights, and you look forward to the full dataset release.

AI-Generated Review

What is OmniBehavior?

OmniBehavior benchmarks large language models on simulating real-world human behaviors, using long-horizon traces across scenarios like video browsing, live streaming, e-commerce, ads, customer service, and search. It tackles the challenge of evaluating LLMs for heterogeneous, cross-scenario user actions from actual Kuaishou app data, providing a JSON demo now and full dataset in 2026. Developers get structured behavior histories with timestamps, contexts, and profiles for testing behavior prediction.
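As a rough illustration of how such a trace might be prepared for behavior prediction, the sketch below sorts one user's events by time and splits them into an observed history and a held-out future. The JSON layout and every field name (`profile`, `events`, `timestamp`, `scenario`, `action`) are assumptions made for this example, as are the sample values; the demo file's real schema may differ.

```python
import json
from datetime import datetime

# Hypothetical trace: the structure and field names are illustrative
# assumptions, not the schema of the actual OmniBehavior demo file.
SAMPLE = json.loads("""
{
  "profile": {"user_id": "u_demo"},
  "events": [
    {"timestamp": "2025-01-01T09:00:00", "scenario": "video", "action": "watch"},
    {"timestamp": "2025-01-02T20:15:00", "scenario": "search", "action": "query"},
    {"timestamp": "2025-01-03T21:30:00", "scenario": "e-commerce", "action": "purchase"},
    {"timestamp": "2025-01-05T08:45:00", "scenario": "live", "action": "comment"}
  ]
}
""")


def split_by_time(events, cutoff):
    """Return (history, targets): events before the cutoff, and events at or after it."""
    ordered = sorted(events, key=lambda e: datetime.fromisoformat(e["timestamp"]))
    history = [e for e in ordered if datetime.fromisoformat(e["timestamp"]) < cutoff]
    targets = [e for e in ordered if datetime.fromisoformat(e["timestamp"]) >= cutoff]
    return history, targets


history, targets = split_by_time(SAMPLE["events"], datetime(2025, 1, 3))
print(len(history), "history events;", len(targets), "held-out events")
```

A time-based cutoff, rather than a random split, keeps future events out of the history that a model would condition on.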

Why is it gaining traction?

Unlike synthetic datasets, it draws on genuine 90-day user sessions from a major platform, enabling cross-domain analysis that is rare in behavior benchmarks. The hook is its ground truth for long-term simulation, which fits the current wave of GitHub projects aimed at real-world human behavior modeling over heterogeneous traces.

Who should use this?

AI researchers evaluating LLMs for user simulation in recommender or agentic systems. Recommendation engineers analyzing cross-scenario patterns such as video-to-purchase flows. Academics benchmarking real-world generalizability in behavior modeling.
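One simple way to look at cross-scenario patterns is to count how often a user moves from one scenario to another in a time-ordered trace. The sketch below does this for a made-up sequence of scenario labels; the labels and the `scenario_transitions` helper are hypothetical and not part of the benchmark.

```python
from collections import Counter


def scenario_transitions(scenarios):
    """Count consecutive (from, to) pairs in which the scenario changes."""
    pairs = zip(scenarios, scenarios[1:])
    return Counter((a, b) for a, b in pairs if a != b)


# Illustrative, made-up sequence of scenario labels in time order.
trace = ["video", "video", "e-commerce", "video",
         "search", "e-commerce", "video", "e-commerce"]

counts = scenario_transitions(trace)
for (src, dst), n in counts.most_common():
    print(f"{src} -> {dst}: {n}")
```

Counting only adjacent pairs is a deliberately crude choice; a real analysis would likely also bound the time gap between the two events.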

Verdict

At 28 stars and 1.0% credibility, it's immature: just a solid paper, website, and demo, with no full code or data yet. Track it for 2026 if long-horizon human traces matter, but skip it for immediate needs.
